Subspace-constrained partial update methods for reduced-complexity signal estimation, parameter estimation, or data dimensionality reduction

ABSTRACT

An adaptive processor implements partial updates when it adjusts weights to optimize adaptation criteria in signal estimation, parameter estimation, or data dimensionality reduction algorithms. The adaptive processor designates some of the weights to be update weights and the other weights to be held weights. Unconstrained updates are performed on the update weights, whereas updates to the set of held weights are performed within a reduced-dimensionality subspace. Updates to the held weights and the update weights employ adapt-path operations for tuning the adaptive processor to process signal data during or after tuning.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a Continuation of U.S. patent application Ser. No. 15/883,359, filed Jan. 30, 2018, now U.S. Pat. No. 10,902,086, which is a Continuation of U.S. patent application Ser. No. 14/121,895, filed Nov. 1, 2014, now U.S. Pat. No. 9,928,212, which claims priority to U.S. Provisional Patent Application No. 61/962,269, filed Nov. 3, 2013, all of which are expressly incorporated by reference herein in their entirety and for all purposes.

FIELD OF THE INVENTION

The present invention relates generally to digital signal processing, with particular emphasis on devices employing one or more adaptive processors with large numbers of adaptation weights (also known as high-dimensionality, or highly-adaptive, signal processors). Although referred to generally as “an adaptive signal processor” or as “an adaptive digital signal processor” (‘digital’ being generally understood to refer to the nature of the processing used by the computational aspect, while the signals are generally understood to be analog electromagnetic waveforms), this phrase covers any singleton or combination device (i.e., whether the ‘processor’ comprises a single element, or a non-zero set of interacting elements), regardless of whether the device's digital processing aspect is entirely embodied in physical hardware, or in a combined form of hardware, special-purpose firmware, and general-purpose processing software. The term ‘adaptive’ refers to processing that adjusts signal weights to the physical signal(s) transmitted, received, or both, by or through said adaptive processor, in order to optimize an adaptation criterion responsive to a functional purpose or the externalities (transient, temporary, situational, and even permanent) for that processor. Each adaptation criterion for the adaptive algorithm may be any of a signal or parameter estimation, measured quality, or any combination thereof.

BACKGROUND OF THE INVENTION

Highly dimensional adaptive processors (devices employing adaptive processors with large numbers of adaptation weights or parameters) are of interest for a wide variety of applications. These applications include:

-   Acoustic echo cancellers, where adaptive noise cancellers employing finite impulse response (FIR) filters with as many as 2,000 adaptively adjusted filter taps are used to remove echoes induced in long-haul telephony networks.
-   Phased array and MIMO radar systems, where large arrays of antennas (10-1,000 elements/array) are used to electronically steer beams at detected targets and nulls at jammers and clutter sources, by combining signals received by the array and distributing transmit signals to the array using large linear matrix operations.
-   Digital predistortion (DPD) processors, where nonlinear adaptive processors with large numbers of parameters (e.g., Volterra-series approximations of nonlinear processes) are used to adaptively learn and digitally invert nonlinear effects added by high-power amplifiers.
-   Smart Grid networks employing spread spectrum modulation formats with large spreading factors and adaptive despreading methods to separate large numbers of co-channel signals, and to detect and remove spoofers from the networks.
-   Massively MIMO cellular networks employing base stations with very large numbers of antenna arrays.

To effect adaptive signal processing in these applications, practical means for adjusting large numbers of weights must be developed and implemented. Techniques that have been developed in the past to accomplish this include nonblind techniques that exploit a known reference signal (e.g., a training or pilot signal inserted into a signal transmitted to the adaptive processor); “partially blind” techniques that exploit a known reference signal with unknown effects added by the communication channel, e.g., delay caused by clock timing offset and physical distance between the transmitter and receiver, and carrier offset caused by local oscillator (LO) offset and Doppler shift between the transmitter and receiver; and fully blind methods that only exploit general structure of the transmitted signal. In many systems, a reference signal can only be made available on a sparse basis, e.g., at the beginning of signal reception, after which the processor must operate using fixed weights without additional training between reference signal reception intervals.

These techniques can also be subdivided into methods with “order-M” (O(M)) or linear complexity, where the number of real multiply-and-accumulate (RMAC) operations per input data sample needed to adapt the processor is on the order of the number of weights M being adjusted by the processor, and methods with higher-order (e.g., O(M^(ν)), where ν>1) complexity, where the number of RMACs per data sample needed to adapt the processor rises much faster than the number of weights being adjusted by the processor. Typically, the most powerful and effective adaptive processing methods have complexity of high order. This presents significant challenges in applications where the number of adaptation weights M is very large.

Lastly, these techniques can be subdivided into sample-processing methods, where the processor weights are adapted every time a new input data sample is provided to the processor, and block-processing methods, where a block of input data is received and used to adapt the processor. In some cases, the algorithm may circulate through the data block multiple times before moving on to the next processing block. Again, the more powerful and effective adaptive processing methods employ block processing, typically with a block size N that is (in many cases, must be) a large multiple of M. However, the cost of this processing is a reduced update rate; a slower response to changes in channel effects affecting the adaptive processor; and (e.g., for multiple passes through the data block) an additional increase in complexity.

It should also be noted that the operations referred to above are the “adapt-path” operations used to train the adaptive processor, not the “data-path” operations used to implement the adaptive processor during and after training. Adapt-path operations are used to tune the adaptive processor used to process a set of signals, while data-path operations are used to process a set of signals during and after tuning. For most of the applications described above (the DPD application being a notable exception), the data-path operations have O(M) complexity, regardless of the complexity of the adapt path.

To address the adapt-path complexity issue in particular, the concept of a partial update (PU) method (PUM; in the plural, PUMs) that only updates a subset of M₁ weights during each adaptation block or sample (referred to hereafter as a block with size N=1) has been proposed for a number of applications. All PUMs developed to date can be interpreted as linearly-constrained optimization techniques, in which the original method is adjusted by applying a hard linear constraint that forces M₀=M−M₁ weights to remain at the same value between adaptation blocks or samples. The subset of weights actually adapted during each data block, or during each of several passes through a data block, is changed during each adaptation event, so that every weight is updated over the course of multiple adaptation events.

This approach has substantive limitations in practice. First, the linear constraint, by its nature, can induce severe misadjustment from the optimal solution sought by the processor. This can manifest as a convergent or steady-state bias from the optimal solution, as a “jitter” or fluctuation about that steady-state solution, or as both. In some applications, e.g., phased array radar applications where the received radar waveform must be extracted from strong clutter and jamming, this can cause the system to fail entirely (studies of PUMs showing “convergence-in-mean” to optimal solutions are almost always conducted under assumptions of little-or-no noise and removable multipath distortion). Even if the processor signal of interest is received at high signal-to-interference-and-noise ratio (SINR), this can lead to well-known “hypersensitivity” issues that degrade system performance relative to the optimal solution.

Second, the linear optimization constraint can only be easily added to a small subset of O(M²) optimization functions, e.g., “least-squares (LS)” or LS-like methods that can be formulated as a quadratic optimization problem, or O(M) “least-mean-squares (LMS)” or LMS-like methods that are either intended to approximate LS optimization algorithms (e.g., by replacing gradients with “stochastic gradient” approximations), or that can themselves be formulated as linearly constrained quadratic optimization problems (e.g., “normalized LMS (NLMS)” and “Affine Projections” algorithms). In many cases, adherence to the constraint significantly increases complexity of the original method, and approximations, e.g., using Lagrange multipliers in which the multiplier itself is added to the algorithm, only increase the misadjustment of the algorithm.

In summary, the PUMs developed to date can only be used with a small number of O(M²) methods, and cannot be used with any O(M^(ν)) methods where ν>2. This is particularly unfortunate, because the PUM should have its strongest utility with these classes of methods. This is especially evident when the complexity of the data-path processing, which as noted above is typically O(M), is added to the adapt-path processing: at best for O(M) adapt-path methods, the PUM will only reduce overall complexity by 50%. This is the background in which the present invention takes form.
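As a rough illustration of the 50% figure, assume (purely for this sketch) that the data path and a full O(M) adapt path each cost about M RMACs per sample, and that a partial update reduces the adapt-path cost to about M₁ RMACs. The symbols C_PU and C_full are labels introduced here for the two totals. The overall complexity ratio is then

$\frac{C_{\mathrm{PU}}}{C_{\mathrm{full}}} \approx \frac{M + M_{1}}{M + M} = \frac{1}{2} + \frac{M_{1}}{2M} > \frac{1}{2}$

so the saving cannot reach 50% however small M₁ is made. Under the same assumptions, an adapt path of order ν>1 whose cost could be reduced from about M^(ν) to about M₁^(ν) would give the ratio

$\frac{M + M_{1}^{\nu}}{M + M^{\nu}}$

which can fall far below one half when M₁«M.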

SUMMARY OF THE INVENTION

The present invention is a method for implementing partial-update methods (PUMs) in any adaptive processor that adjusts weights to optimize an adaptation criterion in a signal estimation or parameter estimation algorithm. In the preferred embodiment, the method does this by performing a linear transformation of the processor parameters being adapted from M-dimensions to (M₁+L)-dimensions in each adaptation event, where M₁ weights are updated without constraints, and M₀=M−M₁ weights are subjected to L soft constraints that force them into an L-dimensional subspace spanned by those weights (preferentially, those weights are a scaled replica of the original weights) at the beginning of the adaptation event, and where M₁ and L are much smaller than M (M₁«M and L«M). Preferentially, L is equal to unity (L=1), i.e., the M₀ constrained weights are forced into a single-dimensional subspace spanned by those weights.

The same dimensionality reduction is also applied to the input data, using the same linear transformation. The reduced-dimensionality weights are then adapted using exactly the same optimization strategy employed by the adaptive processor, except with input data that has also been reduced in dimensionality.
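For the preferred L=1 case, the transformation can be written out explicitly. This is a sketch in notation assumed here for illustration: T(n) is a label introduced for the transformation, M₁(n) and M₀(n) are the update-set and held-set projection matrices described with FIG. 2, w₀=M₀^(T)(n)w is the held-set weight vector at the start of the adaptation event, w₁ is the update-set weight vector, and g₀ is the held-set scalar described with FIG. 3.

$T(n) = \begin{pmatrix} M_{1}(n) & M_{0}(n)w_{0} \end{pmatrix}, \qquad \tilde{w} = \begin{pmatrix} w_{1} \\ g_{0} \end{pmatrix}, \qquad w' = T(n)\tilde{w} = M_{1}(n)w_{1} + g_{0}M_{0}(n)w_{0}$

$\tilde{X}(n) = X(n)T(n) = \begin{pmatrix} X(n)M_{1}(n) & X(n)M_{0}(n)w_{0} \end{pmatrix}, \qquad X(n)w' = \tilde{X}(n)\tilde{w}$

Because the combiner output X(n)w′ equals the reduced-dimensionality output, any criterion that depends on the weights only through the combiner output takes the same value in the (M₁+1)-dimensional problem, which is consistent with reusing the original optimization strategy unchanged.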

BRIEF DESCRIPTION OF THE DRAWINGS

The present invention is illustrated in the attached drawings as described herein.

FIG. 1 is a view of an optimization approach used in a system employing a prior-art nonblind single-port unconstrained adaptation algorithm.

FIG. 2 is a view of an optimization approach used in a system employing a prior-art nonblind single-port partial-update (PU) adaptation algorithm.

FIG. 3 is a view of an optimization approach used in a system employing a nonblind single-port subspace-constrained partial update (SCPU) adaptation algorithm.

FIG. 4 is a view of a nonblind single-port SCPU adapt-path weight update procedure, depicting use of projection matrices (implemented using simple multiplexing and demultiplexing operations) to separate data and weights into unconstrained and subspace-constrained components, allowing use of an unconstrained single-port weight adaptation algorithm of arbitrary type and structure after the subspace separation procedure.

FIG. 5 is a view of a nonblind uncoupled multiport SCPU adapt-path weight update procedure, depicting use of projection matrices (implemented using simple multiplexing and demultiplexing operations) to separate data and weights into unconstrained and subspace-constrained components, allowing use of parallel banks of reduced-complexity unconstrained single-port weight adaptation algorithms of arbitrary type and structure after the subspace separation procedure.

FIG. 6 is a view of a nonblind fully-coupled multiport SCPU adapt-path weight update procedure, depicting use of projection matrices (implemented using simple multiplexing and demultiplexing operations) to separate data and weights into unconstrained and subspace-constrained components, allowing use of an unconstrained multiport weight adaptation algorithm of arbitrary type and structure after the subspace separation procedure.

DETAILED DESCRIPTION OF THE DRAWINGS

FIG. 1 is a view of an optimization approach used in a system employing a prior-art nonblind single-port unconstrained adaptation algorithm. On a first path, a vector processor [1] provides a sequence of data vectors x(n_(sym))=[x₁(n_(sym)) . . . x_(M)(n_(sym))]^(T), each data vector having dimension M×1, where n_(sym) is a symbol index, and where M is a positive integer, referred to here as the degrees-of-freedom (DoF's) of the system, and (⋅)^(T) denotes the matrix transpose operation. As part of a data-path processing procedure, the data vector sequence x(n_(sym)) is then passed through a linear combiner [3] that performs a matrix multiplication of x(n_(sym)) by a weight vector w=[w₁ . . . w_(M)]^(T) having dimension M×1, resulting in an output data scalar y(n_(sym))=x^(T)(n_(sym))w having dimension 1×1.

As part of an adapt-path processing procedure that is a focus of the invention, the data vector sequence x(n_(sym)) is also passed into a bank of M 1:N serial-to-parallel (S/P) converters [2] that convert the vector data sequence into a sequence of data matrices X(n)=[x(nN+1) . . . x(nN+N)]^(T), each data matrix having dimension N×M, where N is a positive integer, referred to here as the block length of the adaptation algorithm, and n is an adapt block index.
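The serial-to-parallel step can be sketched in a few lines of NumPy. This is a minimal illustration, not the patented implementation: the function name `blocks_from_stream`, the 0-based row indexing, and the choice to drop a trailing partial block are all assumptions made for this example.

```python
import numpy as np

def blocks_from_stream(x_stream, N):
    """Stack a stream of M-element data vectors into N x M block matrices.

    x_stream has shape (num_symbols, M); row k holds the transpose of the
    (k+1)-th data vector. Returns shape (num_blocks, N, M), where entry n is
    the block matrix X(n). Symbols that do not fill a whole block are dropped
    (an assumption of this sketch).
    """
    x_stream = np.asarray(x_stream)
    num_blocks = x_stream.shape[0] // N
    M = x_stream.shape[1]
    return x_stream[: num_blocks * N].reshape(num_blocks, N, M)

# Example: M = 3 degrees of freedom, N = 4 symbols per adapt block, 10 symbols.
x_stream = np.arange(30).reshape(10, 3)
X = blocks_from_stream(x_stream, N=4)
print(X.shape)  # two complete blocks of size 4 x 3
```

Each row of a block is one transposed data vector, so the data-path output for a whole block is the matrix-vector product of the block with the weight vector.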

On a second path, and also as part of an adapt-path processing procedure, a sequence of reference scalars s(n_(sym)) is provided by a reference generator [4], each reference scalar having dimension 1×1. In the nonblind adaptation algorithm shown in FIG. 1, the reference scalars s(n_(sym)) are known at the receiver, and are correlated with some component of the data vector x(n_(sym)) in some known manner; however, in other system implementations, the reference scalars may be members of a set of possible known received signal components, or may be derived from the output data vector in some manner.

The reference scalars are then passed into a single 1:N serial-to-parallel (S/P) converter [5] that converts the scalar symbol sequence into a sequence of reference vectors s(n), each reference vector having dimension N×1. The reference vector s(n) is then compared with the data matrix X(n) (from the bank of M 1:N serial-to-parallel converters [2]) over each adapt block, and used to generate a weight vector w using an unconstrained adaptation algorithm [6] that adjusts every element of w to optimize a metric of similarity between the output data vector y(n)=X(n)w and the reference vector s(n), e.g., the sum-of-squares error metric F(w;n)=∥s(n)−X(n)w∥₂², where ∥⋅∥₂ denotes the L2 vector norm. The weights are then passed to the data-path processor [3], where they are used to process the input data vectors on a symbol-by-symbol basis.
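For the sum-of-squares example metric, one possible unconstrained adaptation step is an ordinary block least-squares solve. The sketch below assumes real-valued data and uses NumPy's `np.linalg.lstsq`; the function name `ls_adapt` and the synthetic data are hypothetical, and the text allows any other adaptation algorithm in this role.

```python
import numpy as np

def ls_adapt(X, s):
    """Return the weights minimizing ||s - X w||_2^2 over all M weights."""
    w, *_ = np.linalg.lstsq(X, s, rcond=None)
    return w

rng = np.random.default_rng(0)
N, M = 64, 8                      # block length and number of weights
X = rng.standard_normal((N, M))   # one N x M data block
w_true = rng.standard_normal(M)
s = X @ w_true                    # noise-free reference, so LS recovers w_true
w = ls_adapt(X, s)
y = X @ w                         # data-path output over the same block
```

A direct solve of this kind touches all M weights at once, which is the cost that partial-update methods try to avoid.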

It should be noted that the data matrices and reference vectors do not need to be contiguous, internally or between adapt blocks on the adapt-paths. However, the input data matrices and reference vectors should have internally consistent symbol indices.

FIG. 2 is a view of an optimization approach used in a system employing a prior-art nonblind single-port partial-update (PU) adaptation algorithm. On a first path, a vector processor [1] provides a sequence of data vectors x(n_(sym))=[x₁(n_(sym)) . . . x_(M)(n_(sym))]^(T), each data vector having dimension M×1, where n_(sym) is a symbol index, and where M is a positive integer, referred to here as the degrees-of-freedom (DoF's) of the system, and (⋅)^(T) denotes the matrix transpose operation. As part of a data-path processing procedure, the data vector sequence x(n_(sym)) is then passed through a linear combiner [3] that performs a matrix multiplication of x(n_(sym)) by a weight vector w=[w₁ . . . w_(M)]^(T) having dimension M×1, resulting in an output data scalar y(n_(sym))=x^(T)(n_(sym))w having dimension 1×1.

As part of an adapt-path processing procedure that is a focus of the invention, the data vector sequence x(n_(sym)) is also passed into a bank of M 1:N serial-to-parallel (S/P) converters [2] that convert the vector data sequence into a sequence of data matrices X(n)=[x(nN+1) . . . x(nN+N)]^(T), each data matrix having dimension N×M, where N is a positive integer, referred to here as the block length of the adaptation algorithm, and n is an adapt block index.

On a second path, and also as part of an adapt-path processing procedure, a sequence of reference scalars s(n_(sym)) is provided by a reference generator [4], each reference scalar having dimension 1×1. In the nonblind adaptation algorithm shown in FIG. 2, the reference scalars s(n_(sym)) are known at the receiver, and are correlated with some component of the data vector x(n_(sym)) in some known manner; however, in other system implementations, the reference scalars may be members of a set of possible known received signal components, or may be derived from the output data vector in some manner.

The reference scalars are then passed into a single 1:N serial-to-parallel (S/P) converter [5] that converts the scalar symbol sequence into a sequence of reference vectors s(n), each reference vector having dimension N×1.

On a third path, and also as part of an adapt-path processing procedure, an update-set selection algorithm [7] is used to generate a sequence of M₁-element update-sets $\mathcal{M}_{1}(n)=\{m(1;n),\ldots,m(M_{1};n)\}\subseteq\{1,\ldots,M\}$ and complementary M₀-element held-sets $\mathcal{M}_{0}(n)=\{m\in\{1,\ldots,M\}: m\notin\mathcal{M}_{1}(n)\}$ over each adapt block, such that M₀=M−M₁, $\mathcal{M}_{0}(n)\cup\mathcal{M}_{1}(n)=\{1,\ldots,M\}$, and $\mathcal{M}_{0}(n)\cap\mathcal{M}_{1}(n)=\{\,\}$ within adapt block n. The set selection strategy can be adjusted using deterministic, random, pseudo-random, or data-derived methods. In the partial-update optimization approach shown in FIG. 2, the update-set and held-set $\{\mathcal{M}_{1}(n),\mathcal{M}_{0}(n)\}$ are further used to generate update-set and held-set projection matrices [9] {M₁(n),M₀(n)}, where $M_{\ell}(n)=[e_{M}(m)]_{m\in\mathcal{M}_{\ell}(n)}$ for ℓ=0, 1, and where $e_{M}(m)=[\delta(i-m)]_{i=1}^{M}$ is the m-th M×1 Euclidean basis vector and δ(k) is the Kronecker delta function.
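The set selection and projection matrices can be sketched as follows. This example assumes a deterministic round-robin selection rule and 0-based indices, neither of which is prescribed by the text; the names `round_robin_sets`, `projection_matrices`, `M1_mat`, and `M0_mat` are hypothetical.

```python
import numpy as np

def round_robin_sets(M, M1, n):
    """Update-set and held-set index arrays for adapt block n (0-based)."""
    update = np.sort((n * M1 + np.arange(M1)) % M)
    held = np.setdiff1d(np.arange(M), update)
    return update, held

def projection_matrices(M, update, held):
    """Columns of the M x M identity selected by each index set."""
    eye = np.eye(M)
    return eye[:, update], eye[:, held]   # shapes M x M1 and M x M0

M, M1 = 6, 2
update, held = round_robin_sets(M, M1, n=1)
M1_mat, M0_mat = projection_matrices(M, update, held)

w = np.arange(1.0, M + 1)                 # example weight vector
w1 = M1_mat.T @ w                         # update-set weights
w0 = M0_mat.T @ w                         # held-set weights
w_rebuilt = M1_mat @ w1 + M0_mat @ w0     # reassembles the full vector
```

Because each projection matrix only selects columns of the identity, multiplying by it is equivalent to plain indexing (`w[update]`, `X[:, update]`), which is why these operations can be implemented as multiplexers and demultiplexers rather than true matrix products.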

The reference vector s(n) is then compared with the data matrix X(n) over each adapt block, and used to generate a weight vector w using a hard-constrained adaptation algorithm [8] that adjusts only the elements of w in the update-set, i.e., $\{w_{m}\}_{m\in\mathcal{M}_{1}(n)}$, to optimize a metric of similarity between the output data vector y(n)=X(n)w and the reference vector s(n), e.g., the sum-of-squares error metric F(w;n)=∥s(n)−X(n)w∥₂², while holding the elements of w in the held-set, i.e., $\{w_{m}\}_{m\in\mathcal{M}_{0}(n)}$, equal to the same values held by those weight elements over the previous adapt block. This can be expressed as optimization of existing weight vector w to form new weight vector w′, subject to hard linear constraint M₀^(T)(n)w′=M₀^(T)(n)w. The weights are then passed to the data-path processor [3], where they are used to process the input data vectors on a symbol-by-symbol basis.
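For the sum-of-squares metric, the hard-constrained update reduces to a least-squares solve over the update-set columns after the held-set contribution is subtracted from the reference. The sketch below assumes real-valued data and 0-based index arrays; `pu_ls_step` and the synthetic data are hypothetical, and this is only one instance of a hard-constrained algorithm.

```python
import numpy as np

def pu_ls_step(X, s, w, update, held):
    """One block partial-update LS step: re-solve update-set weights only."""
    target = s - X[:, held] @ w[held]          # remove the held-set output
    w1, *_ = np.linalg.lstsq(X[:, update], target, rcond=None)
    w_new = w.copy()
    w_new[update] = w1                         # held-set entries are untouched
    return w_new

rng = np.random.default_rng(1)
N, M = 64, 8
X = rng.standard_normal((N, M))
s = X @ rng.standard_normal(M) + 0.1 * rng.standard_normal(N)
w = np.ones(M)                                 # weights from a prior block
update = np.array([0, 1, 2])
held = np.setdiff1d(np.arange(M), update)
w_new = pu_ls_step(X, s, w, update, held)
```

The held-set weights leave the step exactly as they entered it, which is the hard linear constraint described above.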

It should be noted that the data matrices and reference vectors do not need to be contiguous, internally or between adapt blocks on the adapt-paths. However, the input data matrices and reference vectors should have internally consistent symbol indices.

FIG. 3 is a view of an optimization approach used in a system employing a new nonblind single-port subspace-constrained partial-update (SCPU) adaptation algorithm. On a first path, a vector processor [1] provides a sequence of data vectors x(n_(sym))=[x₁(n_(sym)) . . . x_(M)(n_(sym))]^(T), each data vector having dimension M×1, where n_(sym) is a symbol index, and where M is a positive integer, referred to here as the degrees-of-freedom (DoF's) of the system, and (⋅)^(T) denotes the matrix transpose operation. As part of a data-path processing procedure, the data vector sequence x(n_(sym)) is then passed through a linear combiner [3] that performs a matrix multiplication of x(n_(sym)) by a weight vector w=[w₁ . . . w_(M)]^(T) having dimension M×1, resulting in an output data scalar y(n_(sym))=x^(T)(n_(sym))w having dimension 1×1.

As part of an adapt-path processing procedure that is a focus of the invention, the data vector sequence x(n_(sym)) is also passed into a bank of M 1:N serial-to-parallel (S/P) converters [2] that convert the vector data sequence into a sequence of data matrices X(n)=[x(nN+1) . . . x(nN+N)]^(T), each data matrix having dimension N×M, where N is a positive integer, referred to here as the block length of the adaptation algorithm, and n is an adapt block index.

On a second path, a sequence of reference scalars s(n_(sym)) is provided by a reference generator [4], each reference scalar having dimension 1×1. In the nonblind adaptation algorithm shown in FIG. 3, the reference scalars s(n_(sym)) are known at the receiver, and are correlated with some component of the data vector x(n_(sym)) in some known manner; however, in other system implementations, the reference scalars may be members of a set of possible known received signal components, or may be derived from the output data vector in some manner.

The reference scalars are then passed into a single 1:N serial-to-parallel (S/P) converter [5] that converts the scalar symbol sequence into a sequence of reference vectors s(n), each reference vector having dimension N×1.

On a third path, an update-set selection algorithm [7] is used to generate a sequence of M₁-element update-sets $\mathcal{M}_{1}(n)=\{m(1;n),\ldots,m(M_{1};n)\}\subseteq\{1,\ldots,M\}$ and complementary M₀-element held-sets $\mathcal{M}_{0}(n)=\{m\in\{1,\ldots,M\}: m\notin\mathcal{M}_{1}(n)\}$ over each adapt block, such that M₀=M−M₁, $\mathcal{M}_{0}(n)\cup\mathcal{M}_{1}(n)=\{1,\ldots,M\}$, and $\mathcal{M}_{0}(n)\cap\mathcal{M}_{1}(n)=\{\,\}$ within adapt block n. The set selection strategy can be adjusted using deterministic, random, pseudo-random, or data-derived methods. In the partial-update optimization approach shown in FIG. 3, the update-set and held-set $\{\mathcal{M}_{1}(n),\mathcal{M}_{0}(n)\}$ are further used to generate update-set and held-set projection matrices [9] {M₁(n),M₀(n)}, where $M_{\ell}(n)=[e_{M}(m)]_{m\in\mathcal{M}_{\ell}(n)}$ for ℓ=0, 1, and where $e_{M}(m)=[\delta(i-m)]_{i=1}^{M}$ is the m-th M×1 Euclidean basis vector and δ(k) is the Kronecker delta function.

The reference vector s(n) is then compared with the data matrix X(n) over each adapt block, and used to generate a weight vector w using a subspace-constrained adaptation algorithm [10] that adjusts the elements of w in the update-set, i.e., $\{w_{m}\}_{m\in\mathcal{M}_{1}(n)}$, to optimize a metric of similarity between the output data vector y(n)=X(n)w and the reference vector s(n), e.g., the sum-of-squares error metric F(w;n)=∥s(n)−X(n)w∥₂², while optimizing the elements of w in the held-set, i.e., $\{w_{m}\}_{m\in\mathcal{M}_{0}(n)}$, to a scalar multiple of the values held by those weight elements over the previous adapt block. This can be expressed as optimization of existing weight vector w to form new weight vector w′, subject to subspace constraint M₀^(T)(n)w′=g₀M₀^(T)(n)w, where g₀ is an unknown scalar that is also optimized by the algorithm. The weights are then passed to the data-path processor [3], where they are used to process the input data vectors on a symbol-by-symbol basis.
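The relationship between the hard constraint of FIG. 2 and the subspace constraint of FIG. 3 can be made explicit for the sum-of-squares metric. In this sketch, w₁ denotes the update-set weights and w₀=M₀^(T)(n)w the held-set weights at the start of the block; the labels F_PU and F_SCPU are notation assumed here for the two per-block minima.

$F_{\mathrm{PU}} = \min_{w_{1}} \left\| s(n) - X(n)\left[ M_{1}(n)w_{1} + M_{0}(n)w_{0} \right] \right\|_{2}^{2}$

$F_{\mathrm{SCPU}} = \min_{w_{1},\,g_{0}} \left\| s(n) - X(n)\left[ M_{1}(n)w_{1} + g_{0}M_{0}(n)w_{0} \right] \right\|_{2}^{2} \le F_{\mathrm{PU}}$

The inequality holds because fixing g₀=1 recovers the hard-constrained problem, so within a block and for the same update-set, the subspace-constrained update achieves a metric value no worse than the conventional partial update.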

It should be noted that the data matrices and reference vectors do not need to be contiguous, internally or between adapt blocks on the adapt-paths. However, the input data matrices and reference vectors should have internally consistent symbol indices.

FIG. 4 is a view of a nonblind, single-port, SCPU adapt-path weight update method, depicting use of projection matrices (implemented using simple multiplexing and demultiplexing operations) to separate data and weights into unconstrained and subspace-constrained components, allowing use of an unconstrained single-port weight adaptation algorithm of arbitrary type and structure after the subspace separation procedure. Over adapt block n, the N×M data matrix X(n) provided by the 1:N S/P bank [2] (not shown) is separated into an N×M₁ dimensional update-set data matrix X₁(n)=X(n)M₁(n) and an N×M₀ dimensional held-set data matrix X₀(n)=X(n)M₀(n), using a columnar matrix demultiplexer (DMX) [11] and the update-set and held-set projection matrices provided over adapt block n from the update-set selection algorithm [7].

An M×M₀ dimensional held-set projection matrix M₀(n) is additionally used to extract the M₀×1 dimensional held-set combiner weights w₀=M₀^(T)(n)w from the M×1 dimensional combiner weights w stored in current memory [12], e.g., computed in prior adapt blocks, using a held-set weight extractor [13]. These held-set combiner weights w₀ are used to multiply the held-set data matrix X₀(n) from the columnar matrix demultiplexer (DMX) [11] through a linear combiner [14], yielding an N×1 held-set output data vector y₀(n)=X₀(n)w₀. X₁(n) and y₀(n) are then combined into an N×(M₁+1) enhanced data matrix X̃(n)=[X₁(n) y₀(n)] using a column-wise multiplexing (MUX) operation [15].

The enhanced data matrix X̃(n) is then input to an unconstrained weight adaptation algorithm [16] that adjusts every element of an (M₁+1)×1 enhanced combiner vector

$\overset{\sim}{w} = \begin{pmatrix}w_{1} \\g_{0}\end{pmatrix}$

to optimize a metric of similarity between an N×1 reference vector s(n) provided by a reference generator [4] (not shown) and an N×1 output data vector y(n)=X̃(n)w̃ that would be provided by an (M₁+1)-element linear combining operation (not shown). The unconstrained weight adaptation algorithm [16] optimizes the same metric as the unconstrained weight adaptation algorithm [6] depicted in prior art FIG. 1, e.g., the sum-of-squares error metric F(w̃;n)=∥s(n)−X̃(n)w̃∥₂². However, the complexity of the unconstrained weight adaptation algorithm is O((M₁+1)^(ν)) in [16], rather than O(M^(ν)) in [6], where ν is the complexity order of the algorithm, e.g., ν=2 if a sum-of-squares metric is used in both Figures.

The updated (M₁+1)×1 enhanced combiner vector w̃ is then demultiplexed (DMX′d) [17] into an updated M₁×1 update-set weight vector w₁ comprising the first M₁ elements of w̃, and a new held-set scalar multiplier g₀ comprising the last element of w̃. The held-set scalar multiplier g₀ is then multiplied by the current M₀×1 held-set weights w₀ [18] to form updated held-set weights w₀←w₀g₀, and multiplexed (MUX′d) [19] with the updated M₁×1 update-set weight vector w₁, in accordance with the current update-set selection algorithm [7], to form updated M×1 dimensional weight vector w=M₁(n)w₁+M₀(n)w₀. This weight vector is then stored in memory [12], allowing its use as an initial combiner weight vector in a subsequent adapt block. The weight vector can also be used in the data-path linear combiner (not shown) for parallel or subsequent data-path processing operations used in the overall system.
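The FIG. 4 procedure can be traced end to end in a short NumPy sketch. It assumes real-valued data, 0-based index arrays in place of the projection matrices, and a block least-squares solve as the unconstrained algorithm [16]; the function name `scpu_ls_step` and the synthetic data are hypothetical, and any other unconstrained algorithm could take the place of the solve.

```python
import numpy as np

def scpu_ls_step(X, s, w, update, held):
    """One subspace-constrained partial-update step for a single port."""
    X1 = X[:, update]                       # N x M1 update-set data (DMX [11])
    w0 = w[held]                            # held-set weights (extractor [13])
    y0 = X[:, held] @ w0                    # N x 1 held-set output (combiner [14])
    X_enh = np.column_stack([X1, y0])       # N x (M1+1) enhanced data (MUX [15])
    w_enh, *_ = np.linalg.lstsq(X_enh, s, rcond=None)   # unconstrained solve [16]
    w1, g0 = w_enh[:-1], w_enh[-1]          # split weights and scalar (DMX [17])
    w_new = np.empty(w.shape, dtype=w_enh.dtype)
    w_new[update] = w1
    w_new[held] = g0 * w0                   # scale held weights [18], MUX [19]
    return w_new, g0

rng = np.random.default_rng(2)
N, M = 64, 8
X = rng.standard_normal((N, M))
s = X @ rng.standard_normal(M) + 0.1 * rng.standard_normal(N)
w = np.ones(M)          # nonzero start, otherwise y0 is identically zero
update = np.array([0, 1, 2])
held = np.setdiff1d(np.arange(M), update)
w_new, g0 = scpu_ls_step(X, s, w, update, held)
```

The solve involves only M₁+1 unknowns regardless of M. Note that if the held-set weights start at zero, the extra column carries no information and g₀ cannot move them, so a practical system would need a nonzero initialization (an observation about this sketch, not a statement from the text).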

FIG. 5 is a view of a nonblind, multiport, uncoupled SCPU adapt-path weight update method, depicting use of projection matrices (implemented using simple multiplexing and demultiplexing operations) to separate data and weights into unconstrained and subspace-constrained components, allowing use of parallel banks of reduced-complexity, unconstrained, single-port weight adaptation algorithms of arbitrary type and structure after the subspace separation step. Over adapt block n, the N×M data matrix X(n) provided by the 1:N S/P bank [2] (not shown) is separated into an N×M₁ dimensional update-set data matrix X₁(n)=X(n)M₁(n) and an N×M₀ dimensional held-set data matrix X₀(n)=X(n)M₀(n), using a columnar matrix demultiplexer (DMX) [11] and the update-set and held-set projection matrices provided over adapt block n from the update-set selection algorithm [7]. On each output port p, where p=1, . . . , P and P is the number of weight ports adapted by the overall algorithm, the M×M₀ dimensional held-set projection matrix M₀(n) is additionally used to extract the M₀×1 dimensional port p held-set combiner weights w₀(p)=M₀^(T)(n)w(p) from the M×1 dimensional port p combiner weight vector w(p) stored in current memory [20], e.g., computed over a prior adapt block, using a held-set weight extractor [13]. The held-set data matrix X₀(n) is then multiplied by the port p held-set combiner weights w₀(p) from the held-set weight extractor [13] through a linear combiner [14], yielding an N×1 port p held-set output data vector y₀(n;p)=X₀(n)w₀(p) for each port p. X₁(n) and y₀(n;p) are then combined into an N×(M₁+1) enhanced port p data matrix X̃(n;p)=[X₁(n) y₀(n;p)] using a column-wise multiplexing (MUX) operation [15].

The port p enhanced data matrix is then input to an unconstrained weight adaptation algorithm [21] that adjusts every element of an (M₁+1)×1 port p enhanced combiner vector

${\overset{\sim}{w}(p)} = \begin{pmatrix}{w_{1}(p)} \\{g_{0}(p)}\end{pmatrix}$

to optimize a metric of similarity between an N×1 port p reference vector s(n;p) provided by a reference generator [4] (not shown) and an N×1 output data vector y(n;p)=X̃(n;p)w̃(p) that would be provided by an (M₁+1)-element linear combining operation (not shown). The unconstrained weight adaptation algorithm [21] optimizes the same metric as the unconstrained weight adaptation algorithm [6] depicted in prior art FIG. 1, e.g., the sum-of-squares error metric F(w̃;n,p)=∥s(n;p)−X̃(n;p)w̃(p)∥₂². However, the complexity of the unconstrained weight adaptation algorithm is O((M₁+1)^(ν)) in [21], rather than O(M^(ν)) in [6], where ν is the complexity order of the algorithm, e.g., ν=2 if a sum-of-squares metric is used in both Figures. Additionally, the weight adaptation algorithm can exploit commonality between the enhanced data matrices {X̃(n;p)}_(p=1)^(P), i.e., the common update-set data matrix X₁(n) contained within each enhanced data matrix, to share results of operations on X₁(n) performed for each algorithm port, resulting in an additional reduction in computational complexity for the overall multiport processor.

The updated (M₁+1)×1 port p enhanced combiner vector w̃(p) is then demultiplexed (DMX′d) [17] into an updated M₁×1 port p update-set weight vector w₁(p) comprising the first M₁ elements of w̃(p), and a new port p held-set scalar multiplier g₀(p) comprising the last element of w̃(p). The held-set scalar multiplier g₀(p) is then multiplied by the current (from the held-set weight extractor [13]) M₀×1 port p held-set weights w₀(p) [18] to form updated port p held-set weights w₀(p)←w₀(p)g₀(p), and multiplexed (MUX′d) [19] with the updated M₁×1 port p update-set weight vector w₁(p), in accordance with the current update-set selection algorithm [7], to form updated M×1 dimensional port p weight vector w(p)=M₁(n)w₁(p)+M₀(n)w₀(p). This weight vector is then stored in memory [20], allowing its use as an initial combiner weight vector in a subsequent adapt block. The weight vector can also be used in a port p data-path linear combiner (not shown) for parallel or subsequent data-path processing operations used in the overall system.

FIG. 6 is a view of a nonblind, multiport, fully-coupled SCPU adapt path weight update method, depicting use of projection matrices (implemented using simple multiplexing and demultiplexing operations) to separate data and weights into unconstrained and subspace-constrained components, allowing use of an unconstrained multiport weight adaptation algorithm of arbitrary type and structure after the subspace separation procedure. Over adapt block n, the N×M data matrix X(n) provided by the 1:N S/P bank [2] (not shown) is separated into an N×M₁ dimensional update-set data matrix X₁(n)=X(n)M₁(n) and an N×M₀ dimensional held-set data matrix X₀(n)=X(n)M₀(n), using a columnar matrix demultiplexer (DMX) [11] and the update-set and held-set projection matrices provided over adapt block n from the update-set selection algorithm [7]. The M×M₀ dimensional held-set projection matrix M₀(n) is additionally used to extract the M₀×P dimensional held-set combiner weights W₀=M₀ ^(T)(n)W from the M×P dimensional combiner weight matrix W stored in current memory [20], e.g., computed over a prior adapt block, using a held-set multiport weight extractor [22]. The held-set data matrix X₀(n) is then multiplied by the M₀×P held-set multiport combiner weights W₀ through a linear combiner [23], yielding N×P multiport held-set output data matrix Y₀(n)=X₀(n)W₀. X₁(n) and Y₀(n) are then combined into an N×(M₁+P) dimensional enhanced data matrix {tilde over (X)}(n)=[X₁(n) Y₀(n)] using a column-wise multiplexing (MUX) operation [24].

The enhanced data matrix is then input to an unconstrained multiport weight adaptation algorithm [25] that adjusts every element of (M₁+P)×P enhanced multiport combiner matrix

$\overset{\sim}{W} = \begin{pmatrix}W_{1} \\G_{0}\end{pmatrix}$

to optimize a metric of similarity between an N×P reference matrix S(n) provided by a multiport reference generator (not shown) and an N×P output data matrix Y(n)={tilde over (X)}(n){tilde over (W)} that would be provided by an (M₁+P)×P element linear combining operation (not shown), e.g., the sum-of-squares error metric F({tilde over (W)};n)=∥S(n)−{tilde over (X)}(n){tilde over (W)}∥_(F) ², where ∥⋅∥_(F) denotes the Frobenius matrix norm. However, the complexity of the unconstrained weight adaptation algorithm is O(P(M₁+P)^(ν)) in [25], where ν is the complexity order of the algorithm, e.g., ν=2 if a sum-of-squares metric is used to optimize {tilde over (W)}.

The updated (M₁+P)×P dimensional enhanced combiner matrix {tilde over (W)} is then demultiplexed (DMX′d) [26] into an updated M₁×P dimensional update-set weight matrix W₁ comprising the first M₁ rows of {tilde over (W)}, and a new P×P dimensional held-set multiplier matrix G₀ comprising the last P rows of {tilde over (W)}. The held-set multiplier matrix G₀ is then multiplied by the current M₀×P held-set weights W₀ [27] to form updated held-set weights W₀←W₀G₀, and multiplexed (MUX′d) [28] with the updated M₁×P update-set weight matrix W₁, in accordance with the current update-set selection algorithm [7], to form updated M×P dimensional weight matrix W=M₁(n)W₁+M₀(n)W₀. This weight matrix is then stored in memory [20], allowing its use as an initial combiner weight matrix in a subsequent adapt block. The weight matrix can also be used in a multiport data-path linear combiner (not shown) for parallel or subsequent data-path processing operations used in the overall system.

DETAILED DESCRIPTION OF EXEMPLARY EMBODIMENTS

A method for processing digital signals by any adaptive processor (as a single element or set of interacting elements, and whether entirely embodied in physical hardware or in a combined form of hardware, special-purpose firmware, and general processing purpose software applied to effect digital signal processing) that adjusts signal weights on the digital signal(s) transmitted, received, or both, by or through said adaptive processor, in order to optimize an adaptation criterion responsive to a functional purpose or the externalities (transient, temporary, situational, and even permanent) for that processor, is explained. This adaptation criterion for the adaptive algorithm may be any of a signal or parameter estimation, measured quality, or any combination thereof.

This method performs a linear transformation of the processor parameters being adapted from M dimensions to (M₁+L) dimensions in each adaptation event, where M₁≪M and L≪M, such that M₁ weights are updated without constraints and M₀=M−M₁ weights are subjected to soft constraints that force them into an L-dimensional subspace spanned by those weights at the beginning of the adaptation period. The same dimensionality reduction, using the same linear transformation, is applied to the input data. The reduced-dimensionality weights are then adapted using the same optimization strategy employed by the adaptive processor, except with input data that has also been reduced in dimensionality. In a preferred embodiment the reduced-dimensionality weights are then adapted using exactly the same optimization strategy. In an alternative embodiment, as when there exists any of hardware, software, or combined hardware and software differentiation in “adapt-path” operations used to tune the adaptive processor and “data-path” operations used by the adaptive processor during and after tuning, the method adapts the reduced-dimensionality weights using substantively the same optimization strategy employed by the adaptive processor for the input data to which the same dimensionality reduction has been applied.

The invention has numerous advantages over the conventional PU approach. These include:

-   -   Substantive reduction or elimination of misadjustment effects induced by the hard linear constraint employed in the conventional PU method.
-   -   Applicability to any optimization function, including functions based on optimal (maximum-likelihood, maximum a posteriori, minimum-mean-square) estimation strategies, and methods such as analytic constant modulus algorithm (ACMA) and cumulant based techniques that have very high-order complexity.
-   -   Ability to reduce adapt block size N significantly, e.g., to N<M, even when the unconstrained approach experiences instability issues at that block size.
-   -   Ability to develop optimization quality measures, e.g., Cramer-Rao bound on parameter or signal estimation performance, that also exploit the dimensionality reduction, and that can track the performance degradation (relative to the unconstrained solution) induced by the partial update.
-   -   Ability to operate with much lower update set sizes than conventional PU, resulting in further reduction in complexity, and therefore cost, of adapt-path processing.
-   -   Ability to be implemented in highly distributed processing architectures, e.g., general-purpose graphical processing units (GPGPUs), that can further exploit the reduced complexity of the approach, or allow processing over multiple parallel update sets with minimal intercommunication between units.
-   -   Applicability to other problems where dimensionality is a known limitation, e.g., pattern recognition over feature sets with large numbers of parameters.

The approach can be used with any update-set selection strategy developed to date, or with new methods exploiting quality measurement advantages of the approach.

The invention is motivated by interpreting prior-art partial-update approaches as hard-constrained optimization algorithms, in which a complex combiner weight vector w having dimension M×1 is updated to optimize metric F(w;n) over adapt block n, i.e.,

$\begin{matrix}{\left. w\leftarrow{\arg\underset{w^{\prime} \in {\mathbb{C}}^{M}}{opt}{F\left( {w^{\prime};n} \right)}} \right.,} & \left( {{Eq}\mspace{14mu} 1} \right)\end{matrix}$

subject to additional linear constraint

$\begin{matrix}{\left( w^{\prime} \right)_{m} = (w)_{m},\quad m \in \mathcal{M}_{0}(n),} & \left( {{Eq}\mspace{14mu} 2} \right)\end{matrix}$

where $\mathcal{M}_{0}(n)=\{m(1;n),\ldots,m(M_{0};n)\}\subset\{1,\ldots,M\}$, referred to here as the block n held-set, is a set of M₀<M indices of weights held constant over adapt block n, and where w in (Eq2) is the combiner weights at the beginning of the adapt block. This resultant constrained optimization criterion can be written in compact matrix algebra as

$\begin{matrix}{w \leftarrow \arg\underset{w^{\prime} \in {\mathbb{C}}^{M}}{opt}\left\{ F\left( w^{\prime};n \right) \ni M_{0}^{T}(n)w^{\prime} = M_{0}^{T}(n)w \right\}} & \left( {{Eq}\mspace{14mu} 3} \right)\end{matrix}$

where $M_{0}(n)=[e_{M}(m_{0})]_{m_{0}\in\mathcal{M}_{0}(n)}$ is the M×M₀ sparse held-set projection matrix, and where e_(M)(m₀)=[δ(m−m₀)]_(m=1) ^(M) is the m₀ ^(th) M×1 Euclidean basis vector and δ(k) is the Kronecker delta function. Example prior-art partial-update algorithms that can be expressed in this manner include:

-   -   The partial-update normalized least-mean-squares (PU-NLMS) algorithm, which modifies the normalized least-mean-squares (NLMS) algorithm taught in [Nagumo67]

$\begin{matrix}{y(n) = x^{T}(n)w,} & \left( {{Eq}\mspace{14mu} 4} \right) \\{w \leftarrow w + \mu\frac{x^{*}(n)}{\left\| x(n) \right\|_{2}^{2}}\left( s(n) - y(n) \right),\quad 0 < \mu \leq 1} & \left( {{Eq}\mspace{14mu} 5} \right) \\{= \arg\min\limits_{w^{\prime} \in {\mathbb{C}}^{M}}\left\| w^{\prime} - w \right\|_{2}^{2} \ni x^{T}(n)w^{\prime} = \hat{s}(n),} & \left( {{Eq}\mspace{14mu} 6} \right) \\{\hat{s}(n) = \mu s(n) + \left( 1 - \mu \right)y(n),\quad 0 < \mu \leq 1} & \left( {{Eq}\mspace{14mu} 7} \right)\end{matrix}$

at adapt symbol index n, where x(n) is an M×1 data vector defined over adapt symbol index n, s(n) is a reference scalar known over adapt symbol index n, μ is the NLMS adaptive stepsize, and ∥⋅∥₂ is the L-2 norm, and where (⋅)^(T) and (⋅)* denote the matrix transpose and complex conjugation operations, respectively. Addition of the held-set weight-update constraint

M ₀ ^(T)(n)w′=M ₀ ^(T)(n)w,  (Eq8)

to (Eq6) yields the PU-NLMS algorithm taught in [Douglas94,Schertler98,Dogancay01],

y(n)=x ^(T)(n)w,  (Eq9)

w _(ℓ) =M _(ℓ) ^(T)(n)w, ℓ=0,1,  (Eq10)

x _(ℓ)(n)=M _(ℓ) ^(T)(n)x(n), ℓ=0,1,  (Eq11)

$\begin{matrix}{w_{1} \leftarrow w_{1} + \mu\frac{x_{1}^{*}(n)}{\left\| x_{1}(n) \right\|_{2}^{2}}\left( s(n) - y(n) \right),\quad 0 < \mu \leq 1,} & \left( {{Eq}\mspace{14mu} 12} \right) \\{w \leftarrow M_{1}(n)w_{1} + M_{0}(n)w_{0}} & \left( {{Eq}\mspace{14mu} 13} \right)\end{matrix}$

where $M_{1}(n)=[e_{M}(m_{1})]_{m_{1}\in\mathcal{M}_{1}(n)}$ is the M×M₁ update-set projection matrix defined over adapt symbol n, and where $\mathcal{M}_{1}(n)=\{m\in\{1,\ldots,M\}: m\notin\mathcal{M}_{0}(n)\}$ is the complementary update-set defined over adapt symbol n.

-   -   The partial-update affine projections (PU-AP) algorithm, which modifies the affine projection algorithm taught in [Ozeki84,Gay93]

$\begin{matrix}{y(n) = X(n)w,} & \left( {{Eq}\mspace{14mu} 14} \right) \\{w \leftarrow w + \mu X^{\dagger}(n)\left( s(n) - y(n) \right),\quad 0 < \mu \leq 1,} & \left( {{Eq}\mspace{14mu} 15} \right) \\{= \arg\min\limits_{w^{\prime} \in {\mathbb{C}}^{M}}\left\| w^{\prime} - w \right\|_{2}^{2} \ni X(n)w^{\prime} = \hat{s}(n),} & \left( {{Eq}\mspace{14mu} 16} \right) \\{\hat{s}(n) = \mu s(n) + \left( 1 - \mu \right)y(n),\quad 0 < \mu \leq 1,} & \left( {{Eq}\mspace{14mu} 17} \right)\end{matrix}$

over N-symbol adapt block n, 1≤N≤M, where X(n)=[x(nN+1) . . . x(nN+N)]^(T) is an N×M data matrix defined over adapt block n, s(n)=[s(nN+1) . . . s(nN+N)]^(T) is a known N×1 reference vector defined over adapt block n, μ is an adaptive stepsize, and (⋅)^(†) denotes the matrix pseudoinverse operation, given by

X ^(†)(n)=X ^(H)(n)(X(n)X ^(H)(n))⁻¹,  (Eq18)

for rank{X(n)}=N≤M, and where (⋅)^(H) and (⋅)⁻¹ denote the matrix conjugate-transpose (Hermitian) and inverse operations, respectively. Addition of constraint (Eq8) to (Eq16) yields the PU-AP algorithm taught in [Naylor04],

y(n)=X(n)w,  (Eq19)

w _(ℓ) =M _(ℓ) ^(T)(n)w, ℓ=0,1,  (Eq20)

X _(ℓ)(n)=X(n)M _(ℓ)(n), ℓ=0,1,  (Eq21)

w ₁ ←w ₁ +μX ₁ ^(†)(n)(s(n)−y(n)), 0<μ≤1,  (Eq22)

w←M ₁(n)w ₁ +M ₀(n)w ₀.  (Eq23)

-   -   The partial-update block least-squares (PU-BLS) algorithm, which modifies the block least-squares (BLS) algorithm given by

$\begin{matrix}{y(n) = X(n)w,} & \left( {{Eq}\mspace{14mu} 24} \right) \\{w \leftarrow \left( 1 - \mu \right)w + \mu X^{\dagger}(n)s(n),} & \left( {{Eq}\mspace{14mu} 25} \right) \\{= \arg\min\limits_{w^{\prime} \in {\mathbb{C}}^{M}}\left\| \hat{s}(n) - X(n)w^{\prime} \right\|_{2}^{2},\quad\left\{ \begin{matrix}{\hat{s}(n) = \mu s(n) + \left( 1 - \mu \right)y(n)} \\{y(n) = X(n)w}\end{matrix} \right.} & \left( {{Eq}\mspace{14mu} 26} \right)\end{matrix}$

over N-symbol adapt block n, N≥M, where ∥⋅∥₂ is the L-2 norm and μ is the BLS adaptive stepsize, and where X(n)=[x(nN+1) . . . x(nN+N)]^(T) is an N×M data matrix defined over adapt block n, s(n)=[s(nN+1) . . . s(nN+N)]^(T) is a known N×1 reference vector defined over adapt block n, and (⋅)^(†) denotes the matrix pseudoinverse operation, given by

X ^(†)(n)=(X ^(H)(n)X(n))⁻¹ X ^(H)(n)  (Eq27)

for rank{X(n)}=M≤N. Addition of constraint (Eq8) to (Eq26) yields the PU-BLS algorithm

y(n)=X(n)w,  (Eq28)

w _(ℓ) =M _(ℓ) ^(T)(n)w, ℓ=0,1,  (Eq29)

X _(ℓ)(n)=X(n)M _(ℓ)(n), ℓ=0,1,  (Eq30)

w ₁←(1−μ)w ₁ +μX ₁ ^(†)(n)(s(n)−X ₀(n)w ₀), 0<μ≤1,  (Eq31)

w←M ₁(n)w ₁ +M ₀(n)w ₀.  (Eq32)

The BLS and PU-BLS algorithms can be interpreted as extensions of the AP and PU-AP algorithms to adapt block sizes N≥M. Similarly, the NLMS and PU-NLMS algorithms can be interpreted as implementations of the AP and PU-AP algorithms for N=1.
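The relationship between the PU-AP and PU-NLMS updates can be illustrated with a minimal NumPy sketch. The function name `pu_ap_update`, the index list `update_idx`, and the random test data are illustrative assumptions and are not part of the algorithms as taught in the cited references; the sketch only shows one way the update of (Eq19) through (Eq23) might be coded.

```python
import numpy as np

def pu_ap_update(w, X, s, update_idx, mu=1.0):
    """One PU-AP step in the spirit of (Eq19)-(Eq23); held-set weights are left untouched."""
    w = np.array(w, dtype=complex)            # copy of the M x 1 combiner weights
    X = np.atleast_2d(X)                      # N x M data block (a single row when N = 1)
    y = X @ w                                 # (Eq19) combiner output for the block
    X1 = X[:, update_idx]                     # (Eq21) update-set columns, N x M1
    w[update_idx] += mu * (np.linalg.pinv(X1) @ (s - y))   # (Eq22)
    return w                                  # (Eq23) held entries keep their old values

# Hypothetical test data: M = 8 weights, N = 3 symbols, M1 = 4 updated weights.
rng = np.random.default_rng(0)
M, N = 8, 3
update_idx = [0, 2, 5, 7]
X = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
s = rng.standard_normal(N) + 1j * rng.standard_normal(N)
w_old = rng.standard_normal(M) + 1j * rng.standard_normal(M)
w_new = pu_ap_update(w_old, X, s, update_idx)
```

With μ=1 and N≤M₁ the updated combiner reproduces the reference over the block, and calling the same function with a single data row gives the PU-NLMS step, because the pseudoinverse of a single row x₁ ^(T)(n) is x₁*(n)/∥x₁(n)∥₂ ².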

A number of observations can immediately be made from this interpretation of the partial-update procedure. First, any linear constraint can induce severe misadjustment from the optimal solution sought by the processor. This can manifest as both a convergent or steady-state bias from the optimal solution, and a “jitter” or fluctuation about that steady-state solution. In some applications, e.g., phased array radar applications where the received radar waveform must be extracted from strong clutter and jamming, this can cause the system to fail entirely. Even if the reference signal is received at high SINR, this can lead to well-known “hypersensitivity” issues that degrade system performance from the optimal solution.

Second, the linear constraint can only be easily added to a small subset of optimization functions. In many cases, strict enforcement of the constraint significantly increases complexity of the original method.

The subspace-constrained approach overcomes both of these problems, by replacing the hard linear constraint M₀ ^(T)(n)w′=M₀ ^(T)(n)w with a softer subspace constraint

$\begin{matrix}{\begin{matrix}{{{M_{0}^{T}(n)}w^{\prime}} \propto {{M_{0}^{T}(n)}w}} \\{{= {{M_{0}^{T}(n)}{wg}_{0}}},{g_{0} \in {\mathbb{C}}},}\end{matrix}\quad} & \left( {{Eq}\mspace{14mu} 33} \right)\end{matrix}$

where the scalar held-set multiplier g₀ and the update-set weights w₁=M₁ ^(T)(n)w are jointly adjusted to optimize the unconstrained criterion given in (Eq1), i.e., by adapting (M₁+1)×1 enhanced weight vector

$\overset{\sim}{w} = \begin{pmatrix}w_{1} \\g_{0}\end{pmatrix}$

using optimization formula

$\begin{matrix}\left. \overset{\sim}{w}\leftarrow{\arg\underset{{\overset{\sim}{w}}^{\prime} \in {\mathbb{C}}^{M_{1} + 1}}{opt}{F\left( {{\overset{\sim}{w}}^{\prime};n} \right)}} \right. & \left( {{Eq}\mspace{14mu} 34} \right)\end{matrix}$

over each data block. The full output weight vector is then given by

w=M ₁(n)w ₁ +M ₀(n)w ₀ g ₀,  (Eq35)

which is efficiently computed using vector-scalar multiplies and multiply-free multiplexing (MUX) operations.

For the exemplary NLMS, AP, and BLS optimization criteria given in (Eq6), (Eq16), and (Eq26), respectively, the SCPU algorithms are implemented using the following procedure:

-   -   Separate w into update-set and held-set components w₁ and w₀ using multiply-free demultiplexing (DMX) operations.
-   -   For the SCPU-AP/NLMS algorithms, and for the SCPU-BLS algorithm with μ<1, construct (M₁+1)×1 dimensional weight vector

$\begin{matrix}{\overset{\sim}{w} = {\begin{pmatrix}w_{1} \\1\end{pmatrix}.}} & \left( {{Eq}\mspace{14mu} 36} \right)\end{matrix}$

-   -   Separate X(n) into update-set and held-set components X₁(n) and X₀(n) using multiply-free columnar DMX operations.
-   -   Compute y₀(n)=X₀(n)w₀. For the SCPU-AP/NLMS algorithms, further compute y₁(n)=X₁(n)w₁ and y(n)=y₁(n)+y₀(n).
-   -   Construct N×(M₁+1) dimensional SCPU data matrix

{tilde over (X)}(n)=[X ₁(n) y ₀(n)].  (Eq37)

Note that y(n)={tilde over (X)}(n){tilde over (w)} constructs the output data from the prior weight set.

-   -   Optimize {tilde over (w)} using the original unconstrained algorithm, with dimensionality reduced from M to M₁+1, yielding

SCPU-AP: {tilde over (w)}←{tilde over (w)}+μ{tilde over (X)}^(†)(n)(s(n)−y(n)), 0<μ≤1  (Eq38)

SCPU-BLS: {tilde over (w)}←(1−μ){tilde over (w)}+μ{tilde over (X)}^(†)(n)s(n), 0<μ≤1,  (Eq39)

where the SCPU-AP algorithm degenerates to SCPU-NLMS if N=1, and extends to SCPU-BLS if N≥M₁.

-   -   Update w₁ and w₀ using formula

$\begin{matrix}{w_{1} \leftarrow \left\lbrack \left( \overset{\sim}{w} \right)_{m} \right\rbrack_{m = 1}^{M_{1}}} & \left( {{Eq}\mspace{14mu} 40} \right) \\{w_{0} \leftarrow g_{0}w_{0},\quad g_{0} = \left( \overset{\sim}{w} \right)_{M_{1} + 1},} & \left( {{Eq}\mspace{14mu} 41} \right)\end{matrix}$

where ({tilde over (w)})_(m) denotes the m^(th) element of vector {tilde over (w)}.

-   -   Reconstruct the linear combiner weights using

w=M ₁(n)w ₁ +M ₀(n)w ₀.  (Eq42)

The SCPU approach employs a data matrix with a nominal dimensionality increase of one over the equivalent PU data matrix, and requires an additional M₀ complex multiplies to update the held-set weight vector. This complexity increase is substantive for the PU-NLMS algorithm, which has O(M) complexity on the data path and O(M₁) complexity on the adapt path. However, this complexity increase is minor for the PU-BLS algorithm, which has O(M₁ ²) complexity on the adapt path, and for the PU-AP algorithm if the adapt block size N is less than but on the order of the number of updated weights M₁ (N≤M₁). Because the SCPU approach optimizes g₀=({tilde over (w)})_(M₁+1) over the complex field, the algorithm is also guaranteed to have lower misadjustment than the equivalent PU method, which constrains g₀ to unity.
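The procedure of (Eq36) through (Eq42), and the misadjustment comparison with the PU method, can be sketched in NumPy for the BLS criterion with μ=1. The function names `pu_bls` and `scpu_bls`, the index lists `upd` and `held`, and the random data are illustrative assumptions; the PU variant shown here is the least-squares fit of the update-set weights with the held-set weights frozen, which corresponds to fixing g₀=1.

```python
import numpy as np

def pu_bls(w, X, s, upd, held):
    """PU-BLS with mu = 1: least-squares fit of the update set, held set frozen (g0 = 1)."""
    w = w.astype(complex)                              # astype returns a copy
    y0 = X[:, held] @ w[held]                          # held-set output
    w[upd] = np.linalg.lstsq(X[:, upd], s - y0, rcond=None)[0]
    return w

def scpu_bls(w, X, s, upd, held):
    """SCPU-BLS with mu = 1, following (Eq37) and (Eq39)-(Eq42)."""
    w = w.astype(complex)
    y0 = X[:, held] @ w[held]                          # held-set output y0(n)
    X_enh = np.column_stack([X[:, upd], y0])           # (Eq37) N x (M1+1) data matrix
    w_enh = np.linalg.lstsq(X_enh, s, rcond=None)[0]   # (Eq39) with mu = 1
    w[upd] = w_enh[:-1]                                # (Eq40) update-set weights
    w[held] = w_enh[-1] * w[held]                      # (Eq41) held set scaled by g0
    return w                                           # (Eq42) full combiner weights

# Hypothetical test data: M = 10 weights, N = 40 symbols, M1 = 3 updated weights.
rng = np.random.default_rng(1)
M, N = 10, 40
upd, held = [0, 1, 2], [3, 4, 5, 6, 7, 8, 9]
X = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
s = rng.standard_normal(N) + 1j * rng.standard_normal(N)
w_old = rng.standard_normal(M) + 1j * rng.standard_normal(M)
w_pu = pu_bls(w_old, X, s, upd, held)
w_scpu = scpu_bls(w_old, X, s, upd, held)
err_pu = np.linalg.norm(s - X @ w_pu)
err_scpu = np.linalg.norm(s - X @ w_scpu)
```

Because the PU solution is the special case g₀=1 of the SCPU search space, `err_scpu` cannot exceed `err_pu` on the adapt block, at the cost of one extra column in the least-squares problem and M₀ multiplies to rescale the held set.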

This implementation of a SCPU algorithm obtains a higher degree of efficiency (in comparison with either an unconstrained partial update, or full update algorithm) through reducing the level of repetitive processing and comparison which is needed to obtain the maximally-beneficial level, and mixture, of signal weightings that, when applied to the next processing effort, will produce the correct answer within the noise constraints. If applied so as to remove arbitrarily-imposed limits on either the processing depth, or on the number of criteria to be evaluated, then a satisficing level of accuracy can be reached without sacrificing the capacities which were otherwise artificially constrained. Since the weighting dimensionality is reduced by and to the level of the constraints on the subspace, without changing the data path, the efficiency of the transforming process is improved over the full analytical processing effort.

The SCPU algorithm employs a data matrix with a nominal dimensionality increase of one over the equivalent partial-update (PU) data matrix, and employs an additional O(M₀)-complexity complex scalar-vector multiply to update the held-set weight vector. This can be expressed in the following compact matrix notation:

$\begin{matrix}{{\overset{\sim}{M}(n)} = \left\lbrack {{M_{1}(n)}\mspace{14mu}{M_{0}(n)}{M_{0}^{T}(n)}w} \right\rbrack} & \left( {{Eq}\mspace{14mu} 43} \right) \\{{\overset{\sim}{X}(n)} = {{X(n)}{\overset{\sim}{M}(n)}}} & \left( {{Eq}\mspace{14mu} 44} \right) \\{\overset{\sim}{w} = {\arg\underset{\overset{\sim}{w} \in {\mathbb{C}}^{M_{1} + 1}}{opt}{F\left( {\overset{\sim}{w};n} \right)}}} & \left( {{Eq}\mspace{14mu} 45} \right) \\{{y(n)} = {{\overset{\sim}{X}(n)}\overset{\sim}{w}}} & \left( {{Eq}\mspace{14mu} 46} \right) \\{\left. w\leftarrow{{\overset{\sim}{M}(n)}\overset{\sim}{w}} \right.,} & \left( {{Eq}\mspace{14mu} 47} \right)\end{matrix}$

where {tilde over (M)}(n) is an M×(M₁+1) sparse mapping matrix that reduces dimensionality of X(n) ahead of the optimization algorithm described symbolically in (Eq45).
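The mapping matrix of (Eq43) can be made concrete with a small NumPy sketch. The function name `scpu_mapping`, the index lists, and the random data are illustrative assumptions. The sketch forms the matrix densely for clarity only; a practical implementation would use the multiplexing operations described above rather than explicit matrix products.

```python
import numpy as np

def scpu_mapping(w, upd, held):
    """Build the M x (M1+1) mapping matrix of (Eq43) from the current weights w."""
    I = np.eye(w.size, dtype=complex)
    M1, M0 = I[:, upd], I[:, held]          # update-set and held-set projection matrices
    held_col = M0 @ (M0.T @ w)              # w with its update-set entries zeroed
    return np.column_stack([M1, held_col])

# Hypothetical test data: M = 6 weights, N = 5 symbols, M1 = 2 updated weights.
rng = np.random.default_rng(2)
M, N = 6, 5
upd, held = [1, 4], [0, 2, 3, 5]
w = rng.standard_normal(M) + 1j * rng.standard_normal(M)
X = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
M_map = scpu_mapping(w, upd, held)          # (Eq43)
X_enh = X @ M_map                           # (Eq44) equals [X1(n) y0(n)]
w_enh = np.append(w[upd], 1.0)              # prior enhanced weights, g0 = 1
```

With g₀=1 the enhanced weights map back to the prior combiner through (Eq47), and the enhanced data matrix reproduces the prior output y(n), so the optimization in (Eq45) starts from the same operating point as the full-dimensional processor.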

This compact notation reveals some additional advantages of the approach:

-   -   The approach is inherently more stable than the unconstrained algorithm on a block-by-block basis, because it updates fewer weights than the unconstrained method, without introducing explicit hard constraints that lead to adaptive “jitter.” Hypersensitivity effects due to large noise subspaces in the received data should be especially reduced in the SCPU method.
-   -   The approach is usable with any optimization criterion, including non-quadratic criteria such as general and analytic constant modulus cost functions [Treichler83,Agee86,Van Der Veen 96], cumulant based objective functions, and eigenvalue-based objective functions [Agee89b,Agee90].
-   -   The approach admits both SCPU maximum-likelihood signal and parameter estimation approaches, and reduced-complexity, constrained quality metrics such as signal-to-interference-and-noise ratio (SINR), Cramer-Rao bounds on parameter estimates, and information-theoretic channel capacity. These metrics may lead to new update-set selection strategies that can overcome identified issues with methods developed to date.

The mapping given in (Eq43) can be extended in many ways to enhance other attributes of the algorithm, e.g., ability to track multiple signals, new selection strategies, and so on. In particular, the approach immediately yields nonblind multiport extensions in which adaptation algorithms are used to extract multiple signals from a received environment.

Two multiport extensions of the AP and BLS methods are taught here based on the unconstrained nonblind algorithms given by

AP: W←W+μX ^(†)(n)(S(n)−Y(n)),  (Eq47)

BLS: W←(1−μ)W+μX ^(†)(n)S(n),  (Eq48)

where W is an M×P combiner matrix, Y(n)=X(n)W is an N×P matrix of combiner output data formed over adapt block n using W, and S(n) is an N×P matrix of reference data known over adapt block n. These multiport extensions include the following:

-   -   An uncoupled multiport extension in which (Eq43) is replaced by P separate mapping matrices

{tilde over (M)}(n; p)=[M ₁(n)  M ₀(n)M ₀ ^(T)(n)w(p)], p=1, . . . ,P,  (Eq49)

{tilde over (X)}(n;p)=X(n){tilde over (M)}(n; p), p=1, . . . ,P,  (Eq50)

i.e., the SCPU constraint (Eq33) is broadened to P separate constraints

M ₀ ^(T)(n)w′(p)=M ₀ ^(T)(n)w(p)g ₀(p), g ₀(p)∈ℂ, p=1, . . . ,P.  (Eq51)

-   -   The uncoupled SCPU-BLS algorithm is then given by

$\begin{matrix}{{\overset{\sim}{X}\left( {n;p} \right)} = {{X(n)}{\overset{\sim}{M}\left( {n;p} \right)}}} & \left( {{Eq}\mspace{14mu} 52} \right) \\{{\overset{\sim}{w}(p)} = \begin{pmatrix}{{M_{1}^{T}(n)}{w(p)}} \\1\end{pmatrix}} & \left( {{Eq}\mspace{14mu} 53} \right) \\{\left. {\overset{\sim}{w}(p)}\leftarrow{{\left( {1 - \mu} \right){\overset{\sim}{w}(p)}} + {\mu{{\overset{\sim}{X}}^{\dagger}\left( {n;p} \right)}{s\left( {n;p} \right)}}} \right.,{0 < \mu \leq 1}} & \left( {{Eq}\mspace{14mu} 54} \right) \\{{y\left( {n;p} \right)} = {{\overset{\sim}{X}\left( {n;p} \right)}{\overset{\sim}{w}(p)}}} & \left( {{Eq}\mspace{14mu} 55} \right) \\\left. {w(p)}\leftarrow{{\overset{\sim}{M}\left( {n;p} \right)}{\overset{\sim}{w}(p)}} \right. & \left( {{Eq}\mspace{14mu} 56} \right)\end{matrix}$

for each port p=1, . . . ,P, where s(n;p) and w(p) are the p^(th) columns of S(n) and W, respectively, and where (Eq53) is only needed if μ<1.

-   -   A fully-coupled multiport extension, in which (Eq43) is replaced by global mapping matrix

{tilde over (M)}(n)=[M ₁(n)  M ₀(n)M ₀ ^(T)(n)W],  (Eq57)

-   -   i.e., the SCPU constraint (Eq33) is broadened to

M ₀ ^(T)(n)W′=M ₀ ^(T)(n)WG ₀ , G ₀∈ℂ^(P×P).  (Eq58)

-   -   The fully-coupled SCPU-BLS algorithm is then given by

$\begin{matrix}{{{\overset{\sim}{X}(n)} = {{X(n)}{\overset{\sim}{M}(n)}}},} & \left( {{Eq}\mspace{14mu} 59} \right) \\{\overset{\sim}{W} = \begin{pmatrix}{{M_{1}^{T}(n)}W} \\I_{P}\end{pmatrix}} & \left( {{Eq}\mspace{14mu} 60} \right) \\{\left. \overset{\sim}{W}\leftarrow{{\left( {1 - \mu} \right)\overset{\sim}{W}} + {\mu{{\overset{\sim}{X}}^{\dagger}(n)}{S(n)}}} \right.,{0 < \mu \leq 1}} & \left( {{Eq}\mspace{14mu} 61} \right) \\{{Y(n)} = {{\overset{\sim}{X}(n)}\overset{\sim}{W}}} & \left( {{Eq}\mspace{14mu} 62} \right) \\{\left. W\leftarrow{{\overset{\sim}{M}(n)}\overset{\sim}{W}} \right.,} & \left( {{Eq}\mspace{14mu} 63} \right)\end{matrix}$

and where (Eq60) is only needed if μ<1.
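A minimal NumPy sketch of one fully-coupled SCPU-BLS block update is given below. The function name `scpu_bls_coupled`, the index lists, and the random data are illustrative assumptions. The enhanced data matrix is built directly as [X₁(n) Y₀(n)] rather than through an explicit product with the mapping matrix of (Eq57), and a generic least-squares solver stands in for the pseudoinverse.

```python
import numpy as np

def scpu_bls_coupled(W, X, S, upd, held, mu=1.0):
    """One fully-coupled SCPU-BLS block update, following (Eq57)-(Eq63)."""
    W = W.astype(complex)                              # astype returns a copy
    P = W.shape[1]
    Y0 = X[:, held] @ W[held, :]                       # N x P held-set outputs
    X_enh = np.hstack([X[:, upd], Y0])                 # (Eq59) N x (M1+P) data matrix
    W_enh = np.vstack([W[upd, :], np.eye(P)])          # (Eq60) prior enhanced weights
    W_ls = np.linalg.lstsq(X_enh, S, rcond=None)[0]    # pseudoinverse solution
    W_enh = (1 - mu) * W_enh + mu * W_ls               # (Eq61)
    W1, G0 = W_enh[:-P, :], W_enh[-P:, :]              # update-set rows and P x P multiplier
    W[held, :] = W[held, :] @ G0                       # held set stays in its prior column span
    W[upd, :] = W1
    return W, G0                                       # (Eq63) full M x P combiner matrix

# Hypothetical test data: M = 8 weights, N = 50 symbols, P = 2 ports, M1 = 3.
rng = np.random.default_rng(3)
M, N, P = 8, 50, 2
upd, held = [0, 3, 6], [1, 2, 4, 5, 7]
X = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
S = rng.standard_normal((N, P)) + 1j * rng.standard_normal((N, P))
W_old = rng.standard_normal((M, P)) + 1j * rng.standard_normal((M, P))
W_new, G0 = scpu_bls_coupled(W_old, X, S, upd, held)
```

In contrast to the uncoupled extension, each port's held-set weights here may become a linear combination of the held-set weights of all P ports, which is the effect of broadening the scalar g₀(p) to the P×P matrix G₀ in (Eq58).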

In an efficient embodiment, the uncoupled multiport SCPU-BLS extension is implemented using whitening methods that exploit the common components of {{tilde over (X)}(n;p)}_(p=1) ^(P), i.e., the N×M₁ dimensional update-set data matrix X₁(n)=X(n)M₁(n).

In particular, using the QR decomposition of {tilde over (X)}(n;p), given by

$\begin{matrix}{\left\{ Q,R \right\} = {QRD}(X),\quad\left\{ \begin{matrix}{R = {chol}\left\{ X^{H}X \right\}} \\{Q = XR^{- 1}}\end{matrix} \right.} & \left( {{Eq}\mspace{14mu} 64} \right) \\{\Rightarrow \left\{ \begin{matrix}{X = QR} \\{Q^{H}Q = I_{M}}\end{matrix} \right.} & \left( {{Eq}\mspace{14mu} 65} \right)\end{matrix}$

for general N×M matrix X with rank{X}=M≤N, where I_(M) is the M×M identity matrix and chol{⋅} is the Cholesky decomposition yielding upper-triangular matrix R with real-positive diagonal values, then the uncoupled multiport SCPU-BLS algorithm given in (Eq54) can be efficiently implemented by first computing the QRD of the common update-set data matrix,

{Q ₁ ,R ₁₁ }=QRD(X ₁(n)),  (Eq66)

and then updating each port p using the recursion

$\begin{matrix}{y_{0} \leftarrow X_{0}(n)w_{0}(p)} & \left( {{Eq}\mspace{14mu} 67} \right) \\{r_{10} \leftarrow Q_{1}^{H}y_{0}} & \left( {{Eq}\mspace{14mu} 68} \right) \\{u_{1} \leftarrow Q_{1}^{H}s\left( n;p \right)} & \left( {{Eq}\mspace{14mu} 69} \right) \\{g_{0}(p) \leftarrow \frac{y_{0}^{H}s\left( n;p \right) - r_{10}^{H}u_{1}}{\left\| y_{0} \right\|_{2}^{2} - \left\| r_{10} \right\|_{2}^{2}}} & \left( {{Eq}\mspace{14mu} 70} \right) \\{w_{1}(p) \leftarrow \left( 1 - \mu \right)w_{1}(p) + \mu R_{11}^{- 1}\left( u_{1} - r_{10}g_{0}(p) \right),\quad 0 < \mu \leq 1} & \left( {{Eq}\mspace{14mu} 71} \right) \\{g_{0}(p) \leftarrow \left( 1 - \mu \right) + \mu g_{0}(p),\quad 0 < \mu \leq 1} & \left( {{Eq}\mspace{14mu} 72} \right)\end{matrix}$

where

${\overset{\sim}{w}(p)} = {\begin{pmatrix}{w_{1}(p)} \\{g_{0}(p)}\end{pmatrix}.}$

This recursion also admits unbiased quality statistic

$\begin{matrix}{\overset{\sim}{\gamma}_{\max}\left( n;p \right) = \left( 1 - \frac{M_{1} + 1}{N} \right)\left( \frac{\overset{\sim}{\eta}}{1 - \overset{\sim}{\eta}} \right) - \frac{M_{1} + 1}{N},\quad\overset{\sim}{\eta} = \frac{\left\| u_{1} \right\|_{2}^{2} + \left| u_{0} \right|^{2}}{\left\| s\left( n;p \right) \right\|_{2}^{2}}} & \left( {{Eq}\mspace{14mu} 73} \right)\end{matrix}$

for each port p, which estimates the relative power between the port p reference signal and background clutter at the output of the port p linear combiner, also referred to as the signal-to-interference-and-noise ratio (SINR) of the combiner output signal.
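The shared-QRD recursion of (Eq66) through (Eq72) can be sketched in NumPy as follows. The function name `scpu_bls_qrd`, the index lists, and the random data are illustrative assumptions. Note that `numpy.linalg.qr` does not guarantee a real-positive diagonal in R, unlike the Cholesky-based definition in (Eq64); the recursion only relies on X₁(n)=Q₁R₁₁ with orthonormal columns in Q₁, so this difference does not affect the result. The quality statistic of (Eq73) is not computed in this sketch.

```python
import numpy as np

def scpu_bls_qrd(W, X, S, upd, held, mu=1.0):
    """Uncoupled multiport SCPU-BLS using one shared QRD of X1(n), (Eq66)-(Eq72)."""
    W = W.astype(complex)                        # astype returns a copy
    Q1, R11 = np.linalg.qr(X[:, upd])            # (Eq66) computed once for all ports
    for p in range(W.shape[1]):
        s = S[:, p]
        y0 = X[:, held] @ W[held, p]             # (Eq67) port p held-set output
        r10 = Q1.conj().T @ y0                   # (Eq68)
        u1 = Q1.conj().T @ s                     # (Eq69)
        den = np.linalg.norm(y0) ** 2 - np.linalg.norm(r10) ** 2
        g0 = (np.vdot(y0, s) - np.vdot(r10, u1)) / den          # (Eq70); vdot conjugates arg 1
        w1 = (1 - mu) * W[upd, p] + mu * np.linalg.solve(R11, u1 - r10 * g0)   # (Eq71)
        g0 = (1 - mu) + mu * g0                  # (Eq72)
        W[upd, p] = w1
        W[held, p] = g0 * W[held, p]             # rescale the port p held set
    return W

# Hypothetical test data: M = 8 weights, N = 50 symbols, P = 3 ports, M1 = 3.
rng = np.random.default_rng(5)
M, N, P = 8, 50, 3
upd, held = [0, 3, 6], [1, 2, 4, 5, 7]
X = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
S = rng.standard_normal((N, P)) + 1j * rng.standard_normal((N, P))
W_old = rng.standard_normal((M, P)) + 1j * rng.standard_normal((M, P))
W_new = scpu_bls_qrd(W_old, X, S, upd, held)
```

The O(N M₁ ²) decomposition is paid once per adapt block, after which each port needs only matrix-vector products and one triangular solve; with μ=1 the result matches a direct least-squares fit on each port's enhanced data matrix.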

The SCPU method is also easily extended to partially blind methods in which the reference vector s(n) is partially known at the receive processor over adapt block n, e.g., the reference vector has an unknown carrier or timing offset relative to the sequence contained in the input data sequence, and to fully blind methods in which the reference vector is unknown but has some known, exploitable structure. Specific examples include:

-   -   Carrier-timing tracking SCPU-BLS algorithms, in which s(n) has an unknown timing and/or carrier offset, e.g., due to propagation delay, Doppler shift and carrier LO uncertainty between the input data and an original transmitted signal containing the reference signal, or a combined frequency shift due to timing and carrier offset if the input data is derived from an OFDM or OFDMA demodulation process. This algorithm replaces the nonblind weight adaptation algorithm given in (Eq39) with

$\begin{matrix}\left. \overset{\sim}{w}\leftarrow{{\left( {1 - \mu} \right)\overset{\sim}{w}} + {\mu{{\overset{\sim}{X}}^{\dagger}(n)}\left( {{s\left( {{\hat{n}}_{off};n} \right)} \circ {\delta\left( {\hat{\omega}}_{off} \right)}} \right)}} \right. & ({Eq74}) \\{{s\left( {n_{off};n} \right)} = \left\lbrack {s\left( {{nN} + n_{sym} + n_{off}} \right)} \right\rbrack_{n_{sym} = 1}^{N}} & ({Eq75}) \\{{\delta\left( \omega_{off} \right)} = \left\lbrack e^{j\;\omega_{off}n_{sym}} \right\rbrack_{n_{sym} = 1}^{N}} & ({Eq76})\end{matrix}$

where {s(n_(sym))} is a component of the transmitted signal that is known over the adapt block except for timing offset n_(off) and a carrier offset ω_(off), and where “∘” is the element-wise matrix multiplication operation. The timing and carrier offset can be optimized over each adapt block by setting

$\begin{matrix}{\left\{ \hat{\omega}_{off}(n),\hat{n}_{off}(n) \right\} = \arg\max\limits_{\omega_{off},n_{off}}\eta\left( \omega_{off},n_{off};n \right),} & \left( {{Eq}\mspace{14mu} 77} \right) \\{\eta\left( \omega_{off},n_{off};n \right) = \frac{\left\| \overset{\sim}{Q}^{H}(n)\left( s\left( n_{off};n \right) \circ \delta\left( \omega_{off} \right) \right) \right\|_{2}^{2}}{\left\| s\left( n_{off};n \right) \right\|_{2}^{2}},} & \left( {{Eq}\mspace{14mu} 78} \right) \\{= \frac{\left\| \sum\limits_{n_{sym} = 1}^{N}\overset{\sim}{q}^{*}\left( n_{sym} \right)s\left( nN + n_{sym} + n_{off} \right)e^{j\omega_{off}n_{sym}} \right\|_{2}^{2}}{\sum\limits_{n_{sym} = 1}^{N}\left| s\left( nN + n_{sym} + n_{off} \right) \right|^{2}},} & \left( {{Eq}\mspace{14mu} 79} \right)\end{matrix}$

where {tilde over (Q)}=[{tilde over (q)}(1) . . . {tilde over (q)}(N)]^(T) is the Q-component of the QRD of {tilde over (X)}(n). Equation (Eq77) can be efficiently implemented using fast Fourier transform (FFT) methods if the frequency offset ω_(off) is completely unknown (acquisition phases), and using Gauss-Newton or Newton methods if the frequency offset ω_(off) is known closely (tracking phases).

Equation (Eq77) also admits quality statistic

$\begin{matrix}{{{\overset{\sim}{\gamma}\left( {\omega_{off},{n_{off};n}} \right)} = {{\left( {1 - \frac{M_{1} + 1}{N}} \right)\left( \frac{\overset{\sim}{\eta}\left( {\omega_{off},{n_{off};n}} \right)}{1 - {\overset{\sim}{\eta}\left( {\omega_{off},{n_{off};n}} \right)}} \right)} - \frac{M_{1} + 1}{N}}},} & \left( {{Eq}\mspace{14mu} 80} \right)\end{matrix}$

which estimates the SINR of the combiner output signal.

-   -   Property-mapping SCPU-BLS algorithms, in which s(n) is a member        of a known property set. These algorithms replace the nonblind        weight adaptation algorithm given in (Eq39) with        property-mapping recursion

$\begin{matrix}{{y(n)} = {{\overset{\sim}{X}(n)}\overset{\sim}{w}}} & \left( {{Eq}\mspace{14mu} 81} \right) \\{{\hat{s}(n)} = {\arg\;{\min\limits_{s \in {\mathcal{D}{(n)}}}{{s - {y(n)}}}}}} & \left( {{Eq}\mspace{14mu} 82} \right) \\{\left. \overset{\sim}{w}\leftarrow{{\left( {1 - \mu} \right)\overset{\sim}{w}} + {\mu{{\overset{\sim}{X}}^{\dagger}(n)}{\hat{s}(n)}}} \right.,} & \left( {{Eq}\mspace{14mu} 83} \right)\end{matrix}$

-   -   -   where 𝒟(n) is a desired signal set, potentially variable as a function of adapt block n, that s(n) is known to belong to. For example, the constant modulus property set 𝒟(n)={z∈ℂ^(N): |(z)_(n)|=1} yields

ŝ(n)=sgn{y(n)}  (Eq84)

-   -   -   where sgn{.} is the element-wise complex sign function sgn{z}=z/|z| on each element, resulting in an SCPU-BLS constant-modulus algorithm. Other exemplary mappings include known modulus mappings in which the elements of s(n) have known magnitude but unknown phase, and decision-direction mappings in which each element of s(n) belongs to a known set of finite values, possibly with an unknown carrier offset.

-   -   -   In all cases, the property-mapping algorithm is applicable to cases in which s(n) does not perfectly possess the property used by the algorithm, but substantively conforms to that property, e.g., |s(nN+n_(sym))|≈1.
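The property-mapping recursion of (Eq81)-(Eq83), specialized to the constant-modulus mapping of (Eq84), can be sketched as follows. This is an illustrative NumPy sketch over a single adapt block, not a definitive implementation: the function names, the step size, the iteration count, the small guard against division by zero, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def complex_sign(z, eps=1e-12):
    """Element-wise complex sign z/|z| of (Eq84); eps guards against |z| = 0."""
    return z / np.maximum(np.abs(z), eps)

def constant_modulus_mapping(X_enh, w0, mu=0.5, n_iter=25):
    """Repeat (Eq81)-(Eq83) on one adapt block with the constant-modulus mapping."""
    X_pinv = np.linalg.pinv(X_enh)           # pseudoinverse of the enhanced data matrix
    w = np.asarray(w0, dtype=complex).copy()
    for _ in range(n_iter):
        y = X_enh @ w                        # (Eq81): combiner output
        s_hat = complex_sign(y)              # (Eq82) with the set of (Eq84)
        w = (1 - mu) * w + mu * (X_pinv @ s_hat)   # (Eq83): relaxed weight update
    return w

# Synthetic check (hypothetical values): one unit-modulus source on four channels.
rng = np.random.default_rng(1)
N = 128
s = np.exp(2j * np.pi * rng.random(N))       # constant-modulus target signal
a = np.array([1.0, 0.5j, -0.3, 0.2 + 0.2j])  # channel gains
noise = 0.01 * (rng.standard_normal((N, 4)) + 1j * rng.standard_normal((N, 4)))
X_enh = np.outer(s, a) + noise
w = constant_modulus_mapping(X_enh, w0=[1, 0, 0, 0])
cm_error = np.max(np.abs(np.abs(X_enh @ w) - 1.0))  # worst-case modulus deviation
```

Other property mappings would change only the line that computes `s_hat`, for example by replacing the complex sign with a nearest-constellation-point decision.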

    -   Dominant-mode prediction (DMP) algorithms, in which s(n) is known to be substantively present in a linear subspace with known or estimable structure, such that

s(n)≈(U _(s)(n)U _(s) ^(H)(n))s(n)  (Eq85)

for a known or postulated N×N_(s)(n) orthonormal basis U_(s)(n), N_(s)(n)<N, and/or such that s(n) is known to be substantively absent from a linear subspace with known or estimable structure, such that

U _(⊥) ^(H)(n)s(n)≈0  (Eq86)

for a known or postulated complementary N×N_(⊥)(n) orthonormal basis U_(⊥)(n), in which N_(s)(n)+N_(⊥)(n)≤N. If only one subspace is available, the other can be derived from it, for example by deriving U_(⊥)(n) from I_(N)−U_(s)(n)U_(s) ^(H)(n), or vice versa.

In this case, the enhanced weight update algorithm is given by

$\begin{matrix}{{\overset{\sim}{w} = {\arg{\max\limits_{w \in {\mathbb{C}}^{M_{1} + 1}}{\overset{\sim}{\gamma}\left( {w;n} \right)}}}},} & \left( {{Eq}\mspace{14mu} 87} \right) \\{{{\overset{\sim}{\gamma}\left( {w;n} \right)} = \frac{{w^{H}\left( {{{\overset{\sim}{X}}_{s}^{H}(n)}{{\overset{\sim}{X}}_{s}(n)}} \right)}w}{{w^{H}\left( {{{\overset{\sim}{X}}_{\bot}^{H}(n)}{{\overset{\sim}{X}}_{\bot}(n)}} \right)}w}},\left\{ {\begin{matrix}{{{\overset{\sim}{X}}_{s}(n)} = {{U_{s}^{H}(n)}{\overset{\sim}{X}(n)}}} \\{{{\overset{\sim}{X}}_{\bot}(n)} = {{U_{\bot}^{H}(n)}{\overset{\sim}{X}(n)}}}\end{matrix}.} \right.} & \left( {{Eq}\mspace{14mu} 88} \right)\end{matrix}$

The enhanced combiner weights {tilde over (w)}_(max) that maximize (Eq88), and the maximal value of (Eq88), {tilde over (γ)}_(max), are equal to the dominant solution {{tilde over (γ)}₁,{tilde over (w)}₁} of the DMP eigenequation,

{tilde over (γ)}_(m)({tilde over (X)} _(⊥) ^(H)(n){tilde over (X)} _(⊥)(n)){tilde over (w)} _(m)=({tilde over (X)} _(s) ^(H)(n){tilde over (X)} _(s)(n)){tilde over (w)} _(m), {tilde over (γ)}_(m)≥{tilde over (γ)}_(m+1)  (Eq89)
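One way to obtain the dominant solution of (Eq89) numerically is sketched below, using only NumPy. The sketch reduces the generalized eigenproblem to an ordinary Hermitian one through a Cholesky factor of the denominator matrix, which assumes that matrix is positive definite (in particular, N_(⊥)(n) must be at least the number of enhanced weights). The function name, the time-gated test scenario, and all numerical values are assumptions made for illustration.

```python
import numpy as np

def dmp_dominant_mode(X_enh, U_s, U_perp):
    """Dominant eigenpair of (Eq89); returns (gamma_1, w_1)."""
    X_s = U_s.conj().T @ X_enh           # data projected onto the signal subspace
    X_p = U_perp.conj().T @ X_enh        # data projected onto the complementary subspace
    A = X_s.conj().T @ X_s               # numerator matrix of (Eq88)
    B = X_p.conj().T @ X_p               # denominator matrix of (Eq88), assumed positive definite
    L = np.linalg.cholesky(B)            # B = L L^H
    L_inv = np.linalg.inv(L)
    C = L_inv @ A @ L_inv.conj().T       # Hermitian matrix with the same eigenvalues
    vals, vecs = np.linalg.eigh(C)       # eigenvalues in ascending order
    w = L_inv.conj().T @ vecs[:, -1]     # map the eigenvector back: w = L^{-H} v
    return float(vals[-1]), w

def rayleigh(X_enh, U_s, U_perp, w):
    """Objective of (Eq88) evaluated at w."""
    num = np.linalg.norm(U_s.conj().T @ (X_enh @ w)) ** 2
    den = np.linalg.norm(U_perp.conj().T @ (X_enh @ w)) ** 2
    return num / den

# Time-gated example (hypothetical): the target occupies only the first half of
# the adapt block, while an interferer and noise occupy the whole block.
rng = np.random.default_rng(2)
N, M = 64, 3
s = np.zeros(N, dtype=complex)
s[: N // 2] = np.exp(1j * (np.pi / 2) * rng.integers(0, 4, N // 2))
interferer = rng.standard_normal(N) + 1j * rng.standard_normal(N)
a = np.array([1.0, 0.6j, -0.4])
b = np.array([0.8, -0.5, 0.7j])
noise = 0.1 * (rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M)))
X_enh = np.outer(s, a) + np.outer(interferer, b) + noise
I_N = np.eye(N)
U_s, U_perp = I_N[:, : N // 2], I_N[:, N // 2 :]   # occupied / unoccupied time slots
gamma_1, w_1 = dmp_dominant_mode(X_enh, U_s, U_perp)
w_rand = rng.standard_normal(M) + 1j * rng.standard_normal(M)
```

Searching over postulated subspaces would amount to repeating this computation for each candidate pair of bases and comparing the resulting dominant eigenvalues.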

The dominant eigenvalue {tilde over (γ)}₁ also provides an estimate of the SINR of the combiner output signal, and can be used both to detect the target signal, and to search over postulated subspaces to find the subspace that most closely contains or rejects s(n).

Example subspaces include:

-   -   Known or postulated time slots used by s(n), such that |s(nN+n_(sym))|≪1/N∥s(n)∥₂ over some known or searchable subset of symbol indices within the adapt block. This generates SCPU time-gated DMP (TG-DMP) algorithms.

-   -   Known or postulated frequency channels used by s(n), such that

$\begin{matrix}{\frac{{{\sum\limits_{n_{sym} = 1}^{N - 1}{{h\left( n_{sym} \right)}{s\left( {{nN} + n_{sym}} \right)}e^{{- j}\omega n_{sym}}}}}^{2}}{\sum\limits_{n_{sym} = 1}^{N - 1}{{h^{2}\left( n_{sym} \right)}{\sum\limits_{n_{sym} = 1}^{N - 1}{{h\left( n_{sym} \right)}{{s\left( {{nN} + n_{sym}} \right)}}^{2}}}}} ⪡ 1} & \left( {{Eq}\mspace{14mu} 90} \right)\end{matrix}$

-   -   over a known or searchable subset of frequency offsets {ω_(⊥)},        where

{h(n_(sym))}_(n_(sym) = 1)^(N)

is a lowpass windowing function, e.g., a Gaussian or Hamming window. This generates SCPU frequency-gated DMP (FG-DMP) algorithms.

-   -   Known or postulated CDMA codes used by s(n), such that

$\begin{matrix}{\frac{{{\sum\limits_{n_{sym} = 1}^{N - 1}{{c^{*}\left( {n_{sym} - n_{off}} \right)}{s\left( {{nN} + n_{sym}} \right)}e^{{- j}\omega_{off}n_{sym}}}}}^{2}}{\sum\limits_{n_{sym} = 1}^{N - 1}{{{c\left( {n_{sym} - n_{off}} \right)}}^{2}{\sum\limits_{n_{sym} = 1}^{N - 1}{{s\left( {{nN} + n_{sym}} \right)}}^{2}}}} \approx 1} & \left( {{Eq}\mspace{14mu} 91} \right)\end{matrix}$

-   -   over a known or searchable subset of carrier offsets {ω_(off)} and timing offsets {n_(off)}, where {c(n_(sym))} is a known spreading code. This generates SCPU code-gated DMP (CG-DMP) algorithms.

-   -   Known or postulated restricted isometry properties (RIP) possessed by s(n), such that it occupies a sparse subset of a basis U_(s) that is known (oracular basis), or that satisfies some sparsity property (general RIP). This generates adaptive decompression algorithms in compressed sensing applications.

    -   Conjugate self-coherence restoral (C-SCORE) algorithms, in which s(n) is known to have substantive conjugate self-coherence at some known or estimable frequency offset ω, such that

$\begin{matrix}{\frac{{\sum\limits_{n_{sym} = 1}^{N}{{s^{2}\left( {{nN} + n_{sym}} \right)}e^{{- j}\omega n_{sym}}}}}{\sum\limits_{n_{sym} = 1}^{N}{{s\left( {{nN} + n_{sym}} \right)}}^{2}} \approx 1} & \left( {{Eq}\mspace{14mu} 92} \right)\end{matrix}$

-   -   In this case, the enhanced weight update algorithm is given by

$\begin{matrix}{\overset{\sim}{w} = \arg\max\limits_{w \in {\mathbb{C}}^{M_{1} + 1}}\overset{\sim}{\rho}\left( w \mid \omega;n \right),} & \left( {{Eq}\mspace{14mu} 93} \right) \\{{\overset{\sim}{\rho}\left( w \mid \omega;n \right) = \frac{\left| w^{H}\left( {\overset{\sim}{X}}^{H}(n)\Delta(\omega){\overset{\sim}{X}}^{*}(n) \right)w^{*} \right|}{w^{H}\left( {\overset{\sim}{X}}^{H}(n)\overset{\sim}{X}(n) \right)w}},\quad{\Delta(\omega) = {diag}\left\{ e^{j\omega n_{sym}} \right\}_{n_{sym} = 1}^{N}}} & \left( {{Eq}\mspace{14mu} 94} \right)\end{matrix}$

-   -   for a postulated twice-carrier offset ω. The enhanced combiner weights {tilde over (w)}_(max) that maximize (Eq94), and the maximal value of (Eq94), {tilde over (ρ)}_(max)(ω;n), are equal to the dominant solution {{tilde over (ρ)}₁(ω),{tilde over (w)}₁(ω)} of the C-SCORE pseudo-eigenequation,

{tilde over (ρ)}_(m)(ω)({tilde over (X)} ^(H)(n){tilde over (X)}(n)){tilde over (w)} _(m)(ω)=({tilde over (X)} ^(H)(n)Δ(ω){tilde over (X)}*(n)){tilde over (w)}* _(m)(ω), {tilde over (ρ)}_(m)≥{tilde over (ρ)}_(m+1).  (Eq95)

-   -   The SCPU C-SCORE algorithm is expected to have application to BPSK, MSK, and GMSK signals, such as the 1 Mbps (BPSK) 802.11 DSSS signal. The algorithm also extends to carrier-tracking algorithms, for example using an FFT-based search algorithm. In this case, the line spectrum used to detect the SOIs is the dominant pseudoeigenmode {tilde over (ρ)}_(max)(ω;n).
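To make the conjugate self-coherence objective concrete, the following NumPy sketch evaluates the ratio in (Eq94) for a given weight vector and postulated twice-carrier offset. It does not solve the pseudo-eigenequation (Eq95); it only shows how the objective behaves for a BPSK signal. The function name, the two-column test matrix, and the chosen offsets are assumptions made for this example.

```python
import numpy as np

def cscore_objective(X_enh, w, omega):
    """Value of the ratio in (Eq94) for weights w and twice-carrier offset omega."""
    N = X_enh.shape[0]
    n_sym = np.arange(1, N + 1)
    y = X_enh @ w
    # w^H X^H Delta(omega) X^* w^* equals sum_n conj(y_n)^2 exp(j*omega*n_sym).
    num = np.abs(np.sum(np.conj(y) ** 2 * np.exp(1j * omega * n_sym)))
    den = np.real(np.vdot(y, y))          # w^H X^H X w, the output power
    return num / den

# BPSK with a carrier offset (hypothetical values): s^2 is a pure tone at twice
# the carrier offset, so the objective peaks there and vanishes on other bins.
rng = np.random.default_rng(4)
N = 128
n_sym = np.arange(1, N + 1)
omega_c = 2 * np.pi * 5 / N
bits = rng.choice([-1.0, 1.0], size=N)
s = bits * np.exp(1j * omega_c * n_sym)
noise = rng.standard_normal(N) + 1j * rng.standard_normal(N)
X_enh = np.column_stack([s, noise])
w = np.array([1.0, 0.0], dtype=complex)   # selects the BPSK column only
rho_on = cscore_objective(X_enh, w, 2 * omega_c)
rho_off = cscore_objective(X_enh, w, 2 * omega_c + 2 * np.pi * 3 / N)
```

Evaluating the objective over a grid of ω values gives a line spectrum of the kind the text describes for detection and carrier tracking.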

Extensions of all of these algorithms to fully-coupled and uncoupled multiport SCPU methods are straightforward.

It should also be recognized that, while all of the techniques described here are defined over the “complex field,” such that w∈ℂ^(M), they are equally applicable to combiners and optimization metrics defined over other fields, including the real field, e.g., w∈ℝ^(M), and Galois fields usable in integer field codes. In each case, the subspace constraint

$\begin{matrix}{\begin{matrix}{{{M_{0}^{T}(n)}w^{\prime}} \propto {{M_{0}^{T}(n)}w}} \\{{= {{M_{0}^{T}(n)}{wg}_{0}}},{g_{0} \in {\mathbb{S}}},}\end{matrix}\quad} & \left( {{Eq}\mspace{14mu} 96} \right)\end{matrix}$

where 𝕊 is the field in which each element of w is defined, results in a valid SCPU method. The method is also applicable to linear-conjugate-linear (LCL) methods

$\begin{matrix}{\begin{matrix}{{{{{\overset{\_}{M}}_{0}^{T}(n)}{\overset{\_}{w}}^{\prime}} \propto {{{\overset{\_}{M}}_{0}^{T}(n)}\overset{\_}{w}}},\left\{ \begin{matrix}{\overset{\_}{w} = {\frac{1}{\sqrt{2}}\begin{pmatrix}w \\w^{*}\end{pmatrix}}} \\{{M_{\ell}(n)} = \left\lbrack {{M_{\ell}(n)}\mspace{14mu}{M_{\ell}(n)}} \right\rbrack}\end{matrix} \right.} \\{{= {{{\overset{\_}{M}}_{0}^{T}(n)}\overset{\_}{w}\mspace{11mu}{\overset{\_}{g}}_{0}}},{{\overset{\_}{g}}_{0} = {\frac{1}{\sqrt{2}}\begin{pmatrix}g_{0} \\g_{0}^{*}\end{pmatrix}}},}\end{matrix}\quad} & \left( {{Eq}\mspace{14mu} 97} \right)\end{matrix}$

which allows the SCPU method to be applied to optimization functions that are more complicated functions of complex variables. Moreover, the techniques are applicable to processors that implement nonlinear functions on the data path as well as the adapt path, if the original optimization constraint is a linear function of w. In a preferred embodiment, the step of performing a dimensionality reduction comprising a linear transformation of the processor parameters being adapted from M-dimensions to (M₁+L)-dimensions in each adaptation event comprises applying a subspace constraint described in the form:

$\begin{matrix}{{M_{0}^{T}(n)}w^{\prime} \propto {M_{0}^{T}(n)}w} & \left( {{Eq}\mspace{14mu} 98} \right) \\{= {M_{0}^{T}(n)}w\, g_{0}} & \left( {{Eq}\mspace{14mu} 99} \right)\end{matrix}$
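The subspace constraint of (Eq98)-(Eq99) can be illustrated with a single-port least-squares sketch in NumPy. The sketch assumes that M_(0)(n) acts as a selection matrix that picks out the held weights, so that the constraint scales all held weights by one common factor g_(0) while the remaining weights are updated freely; this interpretation, the function name, the cyclic subset selection, and the test data are assumptions made for the example rather than the specification's definitive procedure.

```python
import numpy as np

def scpu_ls_update(X, s, w, update_idx):
    """One subspace-constrained partial update, solved by least squares.

    The update weights are adapted without constraint; the held weights may
    only be scaled by a common gain g0, as in (Eq98)-(Eq99).
    """
    update_idx = np.asarray(update_idx, dtype=int)
    held_idx = np.setdiff1d(np.arange(X.shape[1]), update_idx)
    x_held = X[:, held_idx] @ w[held_idx]                 # held channels collapsed to one column
    X_enh = np.column_stack([X[:, update_idx], x_held])   # enhanced data, M_1 + 1 columns
    w_enh, *_ = np.linalg.lstsq(X_enh, s, rcond=None)     # unconstrained solve for [update weights, g0]
    w_new = w.copy()
    w_new[update_idx] = w_enh[:-1]
    w_new[held_idx] = w[held_idx] * w_enh[-1]             # held weights scaled by g0
    return w_new

# Cyclic subset selection over several adaptation events (hypothetical data).
rng = np.random.default_rng(3)
N, M, M1 = 200, 8, 2
X = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
w_true = rng.standard_normal(M) + 1j * rng.standard_normal(M)
s = X @ w_true
w = np.ones(M, dtype=complex)
residuals = [np.linalg.norm(X @ w - s)]
for event in range(12):
    update_idx = (np.arange(M1) + M1 * event) % M
    w_prev = w
    w = scpu_ls_update(X, s, w, update_idx)
    residuals.append(np.linalg.norm(X @ w - s))
held_idx = np.setdiff1d(np.arange(M), update_idx)
held_ratio = w[held_idx] / w_prev[held_idx]               # equals g0 for every held weight
```

Because the previous weights correspond to the feasible choice g_(0)=1 with unchanged update weights, each event solves an (M₁+1)-dimensional problem and cannot increase the least-squares residual.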

While this invention is susceptible of embodiment in many different forms, there is shown in the drawings and will herein be described in detail several specific embodiments with the understanding that the present disclosure is to be considered as an exemplification of the principles of the invention and is not intended to limit the invention to the embodiments illustrated.

Some of the above-described functions may be composed of instructions, or depend upon and use data, that are stored on storage media (e.g., computer-readable medium). The instructions and/or data may be retrieved and executed by the processor. Some examples of storage media are memory devices, tapes, disks, and the like. The instructions are operational when executed by the processor to direct the processor to operate in accord with the invention; and the data is used when it forms part of any instruction or result therefrom.

The terms “computer-readable storage medium” and “computer-readable storage media” as used herein refer to any medium or media that participate in providing instructions to a CPU for execution. Such media can take many forms, including, but not limited to, non-volatile (also known as ‘static’ or ‘long-term’) media, volatile media and transmission media. Non-volatile media include, for example, one or more optical or magnetic disks, such as a fixed disk, or a hard drive. Volatile media include dynamic memory, such as system RAM or transmission or bus ‘buffers’. Common forms of computer-readable media include, for example, a floppy disk, a flexible disk, a hard disk, magnetic tape, any other magnetic medium, a CD-ROM disk, digital video disk (DVD), any other optical medium, any other physical medium with patterns of marks or holes.

“Memory”, as used herein when referring to computers, is the functional hardware that for the period of use retains a specific structure which can be and is used by the computer to represent the coding, whether data or instruction, which the computer uses to perform its function. Memory thus can be volatile or static, and be any of a RAM, a PROM, an EPROM, an EEPROM, a FLASH EPROM, any other memory chip or cartridge, a carrier wave, or any other medium from which a computer can read data, instructions, or both.

“I/O”, or ‘input/output’, is any means whereby the computer can exchange information with the world external to the computer. This can include a wired, wireless, acoustic, infrared, or other communications link (including specifically voice or data telephony); a keyboard, tablet, camera, video input, audio input, pen, or other sensor; and a display (2D or 3D, plasma, LED, CRT, tactile, or audio). That which allows another device, or a human, to interact with and exchange data with, or control and command, a computer, is an I/O device, without which any computer (or human) is essentially in a solipsistic state.

The above description of the invention is illustrative and not restrictive. Many variations of the invention may become apparent to those of skill in the art upon review of this disclosure. The scope of the invention should, therefore, be determined not with reference to the above description, but instead should be determined with reference to the appended claims along with their full scope of equivalents.

While the present invention has been described in connection with at least one preferred embodiment, these descriptions are not intended to limit the scope of the invention to the particular forms (whether elements of any device or architecture, or steps of any method) set forth herein. It will be further understood that the elements, or steps in methods, of the invention are not necessarily limited to the discrete elements or steps, or the precise connectivity of the elements or order of the steps described, particularly where elements or steps which are part of the prior art are not referenced (and are not claimed). To the contrary, the present descriptions are intended to cover such alternatives, modifications, and equivalents as may be included within the spirit and scope of the invention as defined by the appended claims and otherwise appreciated by one of ordinary skill in the art.

1. A method of implementing partial updates in an adaptive processor that adjusts weights to optimize an adaptation criterion in a signal estimation or parameter estimation algorithm, the method comprising: selecting, from a set of weights, a set of update weights and a set of held weights; performing unconstrained updates to the set of update weights; and performing updates to the set of held weights within a reduced-dimensionality subspace, wherein performing updates to the set of held weights and performing unconstrained updates to the set of update weights uses adapt-path operations for tuning the adaptive processor to process signal data during or after tuning.
2. The method of claim 1, wherein performing unconstrained updates to the set of update weights and performing updates to the set of held weights comprises constructing a set of enhanced weights from the set of update weights and a projection of the set of held weights onto the reduced dimensionality subspace, and performing an unconstrained update to the set of enhanced weights.
3. The method of claim 1, wherein performing unconstrained updates to the set of update weights and performing updates to the set of held weights comprises at least one of performing a single-port partial-update adaptation algorithm, an uncoupled multiport adapt-path weight update algorithm, a fully-coupled multiport adapt-path weight update algorithm, a partial-update normalized least-mean-squares algorithm, a partial-update affine projections algorithm, a partial-update block least-squares algorithm, a non-blind update weight adaptation algorithm, a partially blind update weight adaptation algorithm, a fully blind update weight adaptation algorithm, a carrier-timing tracking algorithm, a property-mapping algorithm, a dominant-mode prediction algorithm, a self-coherence restoral algorithm, a QR decomposition algorithm, an eigenvalue-based algorithm, a data dimensionality reduction method, a QR-based method, or a linear-conjugate-linear algorithm.
4. The method of claim 3, wherein the uncoupled multiport adapt-path weight update algorithm uses parallel banks of single-port weight adaptation algorithms.
5. The method of claim 1, wherein selecting comprises a set-selection strategy that is deterministic, random, pseudo-random, or data-derived.
6. The method of claim 1, wherein performing updates to the set of held weights within a reduced-dimensionality subspace comprises updating a dimensionality-reduction strategy.
7. The method of claim 1, wherein performing updates to the set of held weights is configured to employ an optimization strategy of arbitrary type and structure.
8. The method of claim 1, wherein performing updates to the set of held weights employs at least one of a set of optimization strategies, the set comprising maximum-likelihood estimation, maximum a priori estimation, minimum-mean-square estimation, an analytic constant modulus algorithm, a cumulant based algorithm, and an eigenvalue-based objective function.
9. The method of claim 1, wherein the adaptive processor is configured to perform at least one of echo cancellation and noise cancellation.
10. The method of claim 1, wherein the adaptive processor is configured to process signals in a phased array, a Multiple Input Multiple Output (MIMO) radar, a MIMO wireless network, a massive MIMO cellular network, or a predistortion processor.
11. The method of claim 1, wherein the adaptive processor is configured to perform pattern recognition over feature sets comprising a plurality of parameters.
12. A non-transitory computer-readable medium having computer readable program code stored thereon, the computer readable program code being executable by at least one adaptive processor for: selecting, from a set of weights, a set of update weights and a set of held weights; performing unconstrained updates to the set of update weights; and performing updates to the set of held weights within a reduced-dimensionality subspace, wherein performing updates to the set of held weights and performing unconstrained updates to the set of update weights uses adapt-path operations for tuning the adaptive processor to process signal data during or after tuning.
13. The non-transitory computer-readable medium of claim 12, wherein performing unconstrained updates to the set of update weights and performing updates to the set of held weights comprises constructing a set of enhanced weights from the set of update weights and a projection of the set of held weights onto the reduced dimensionality subspace, and performing an unconstrained update to the set of enhanced weights.
14. The non-transitory computer-readable medium of claim 12, wherein performing unconstrained updates to the set of update weights and performing updates to the set of held weights comprises at least one of performing a single-port partial-update adaptation algorithm, an uncoupled multiport adapt-path weight update algorithm, a fully-coupled multiport adapt-path weight update algorithm, a partial-update normalized least-mean-squares algorithm, a partial-update affine projections algorithm, a partial-update block least-squares algorithm, a non-blind update weight adaptation algorithm, a partially blind update weight adaptation algorithm, a fully blind update weight adaptation algorithm, a carrier-timing tracking algorithm, a property-mapping algorithm, a dominant-mode prediction algorithm, a self-coherence restoral algorithm, a QR decomposition algorithm, an eigenvalue-based algorithm, a data dimensionality reduction method, a QR-based method, or a linear-conjugate-linear algorithm.
15. The non-transitory computer-readable medium of claim 12, wherein the uncoupled multiport adapt-path weight update algorithm uses parallel banks of single-port weight adaptation algorithms.
16. The non-transitory computer-readable medium of claim 12, wherein selecting comprises a set-selection strategy that is deterministic, random, pseudo-random, or data-derived.
17. The non-transitory computer-readable medium of claim 12, wherein performing updates to the set of held weights within a reduced-dimensionality subspace comprises updating a dimensionality-reduction strategy.
18. The non-transitory computer-readable medium of claim 12, wherein performing updates to the set of held weights is configured to employ an optimization strategy of arbitrary type and structure.
19. The non-transitory computer-readable medium of claim 12, wherein performing updates to the set of held weights employs at least one of a set of optimization strategies, the set comprising maximum-likelihood estimation, maximum a priori estimation, minimum-mean-square estimation, an analytic constant modulus algorithm, a cumulant based algorithm, and an eigenvalue-based objective function.
20. The non-transitory computer-readable medium of claim 12, further comprising computer readable program code that is executable by the at least one adaptive processor to perform at least one of echo cancellation and noise cancellation.
21. The non-transitory computer-readable medium of claim 12, further comprising computer readable program code that is executable by the at least one adaptive processor to process signals in a phased array, a Multiple Input Multiple Output (MIMO) radar, a MIMO wireless network, a massive MIMO cellular network, or a predistortion processor.
22. The non-transitory computer-readable medium of claim 12, further comprising computer readable program code that is executable by the at least one adaptive processor to perform pattern recognition over feature sets comprising a plurality of parameters.
23. The non-transitory computer-readable medium of claim 12, further comprising computer readable program code that is executable by the at least one adaptive processor to apply the set of update weights and the set of held weights to the signal data to be processed in the signal estimation or parameter estimation algorithm, which outputs a signal estimate or a parameter estimate.